Spike trains are the main components of information processing in the brain.
Several point processes have been investigated in the literature to model spike trains,
and more macroscopic approaches have also been studied, using partial differential equation
models. The main aim of the present article is to build a bridge between several point
process models (Poisson, Wold, Hawkes) that have been shown to statistically fit
real spike train data and age-structured partial differential equations as introduced by
Pakdaman, Perthame and Salort.
Introduction

In neuroscience, action potentials (spikes) are the main components of real-time
information processing in the brain. Indeed, thanks to synaptic integration,
the membrane voltage of a neuron depends on the action potentials emitted by
other neurons, whereas if this membrane potential is sufficiently high, the neuron
itself produces action potentials.


To access those phenomena, schematically, one can proceed in two ways: record
several neurons extracellularly in vivo at the same time and have access to
simultaneous spike trains (only the list of events corresponding to action potentials),
or record intracellularly the whole membrane voltage of only one neuron at a
time, being blind to the nearby neurons.

Many people focus on spike trains. Those data are fundamentally random and
can be modelled easily by time point processes, i.e. random countable sets of points
on $\mathbb{R}_+$. Several point process models have been investigated in the literature, each
of them reproducing different features of the neuronal reality. The simplest model is
the homogeneous Poisson process, which can only reproduce a constant firing rate
for the neuron and which, in particular, fails to reproduce refractory periods. It
is commonly admitted that this model is too poor to be realistic. Indeed, in such a
model, two points or spikes can be arbitrarily close as long as their overall frequency
is respected on average. A more realistic model is the renewal process,
where the occurrence of a point or spike depends on the previous occurrence. More
precisely, the distribution of delays between spikes (also called inter-spike intervals,
ISI) is given, and a distribution that puts small weights on small delays is
able to mimic refractory periods. A deeper statistical analysis has shown that Wold
processes give good results with respect to goodness-of-fit tests on real data
sets. Wold processes are point processes for which the next occurrence of a spike
depends on the previous occurrence but also on the previous ISI. From another
point of view, the fact that spike trains are usually non-stationary can be easily
modelled by inhomogeneous Poisson processes. None of those models reflects
one of the main features of spike trains, namely synaptic integration, and there
have been various attempts to capture this phenomenon. One of the main models is the
Hawkes model, which has recently been shown
to fit several stationary data sets. Several studies have been done in similar directions.
More recently, a vast interest has been shown in generalized
linear models, with which one can infer functional connectivity and which are
just an exponential variant of Hawkes models.

There have also been several models of the full membrane voltage, such as
Hodgkin-Huxley models. It is possible to fit some of those probabilistic stochastic
differential equations (SDE) on real voltage data and to use them to estimate
meaningful physiological parameters. However, the lack of simultaneous data
(voltages of different neurons at the same time) prevents these models from being used as
statistical models that can be fitted on network data to estimate network parameters.
A simple SDE model taking synaptic integration into account is the well-known
Integrate-and-Fire (IF) model. Several variations have been proposed to describe
several features of real neural networks, such as oscillations. In particular, there
exist hybrid IF models including an inhomogeneous voltage-driven Poisson process
that are able to mimic real membrane potential data. However, to our knowledge


Biologically, a neuron cannot produce two spikes too closely in time. 








June 9, 2015 0:43 WSPC/INSTRUCTION FILE PDE'Hawkes'Mariell 


Microscopic approach of a time elapsed neural model 3 

and unlike for point process models, no statistical test has been applied to show
that any of the previous variations of the IF model fits real network data.

Both approaches, SDE and point processes, are microscopic descriptions, where
random noise explains the intrinsic variability. Many authors have argued that there
must be some more macroscopic approach describing huge neural networks as a
whole, using PDE formalism. Some authors have already been able to establish a
link between PDE approaches as the macroscopic system and SDE approaches (in
particular IF models) as the microscopic model. Another macroscopic point
of view on spike trains is proposed by Pakdaman, Perthame and Salort in a series
of articles, using a nonlinear age-structured equation to describe the spike
density. Adopting a population view, they aim at studying relaxation to equilibrium
or spontaneous periodic oscillations. Their model is justified by a qualitative,
heuristic approach. Like many other models, their model shows several qualitative
features, such as oscillations, that make it quite plausible for real networks, but once
again there is, to our knowledge, no statistical proof of it.

In this context, the main purpose of the present article is to build a bridge between
several point process models that have been shown to statistically fit real
spike train data and age-structured PDE of the type of Pakdaman, Perthame and
Salort. The point processes are the microscopic models, the PDE being their
meso-macroscopic counterpart. In this sense, it extends PDE approaches for IF models to
models that statistically fit true spike train data. In the first section, we introduce
the Pakdaman, Perthame and Salort PDE (PPS) via its heuristic, informal and microscopic
description, which is based on IF models. Then, in Section 2, we develop
the different point process models, quite informally, to draw the main heuristic
correspondences between both approaches. In particular, we introduce the conditional
intensity of a point process and a fundamental construction, called Ogata's
thinning, which allows a microscopic understanding of the dynamics of a point
process. Thanks to Ogata's thinning, in Section 3 we have been able to rigorously
derive a microscopic random weak version of (PPS) and to propose its deterministic
counterpart in expectation. An independent and identically distributed (i.i.d.)
population version is also available. Several examples of applications are then discussed.
To facilitate reading, technical results and proofs are included in two
appendices. The present work is clearly just a first step to link point processes and PDE:
there are many more open questions than answered ones, and this is discussed in the
final conclusion. However, we think that this can be fundamental to acquire a deeper
understanding of spike train models, their advantages as well as their limitations.


1. Synaptic integration and (PPS) equation 

Based on the intuition that every neuron in the network should behave in the same
way, Pakdaman, Perthame and Salort proposed a deterministic PDE, denoted
(PPS) in the sequel. The origin of this PDE is the classical (IF) model. In this section
we describe the link between the (IF) microscopic model and the mesoscopic (PPS)
model, the main aim being to show thereafter the relation between the (PPS) model
and other natural microscopic models for spike trains: point processes.


1.1. Integrate-and-fire 


Integrate-and-fire models describe the time evolution of the membrane potential,
$V(t)$, by means of ordinary differential equations as follows

$$C_m \frac{dV}{dt} = -g_L (V - V_L) + I(t), \tag{1.1}$$


where $C_m$ is the capacitance of the membrane, $g_L$ is the leak conductance and $V_L$
is the leak reversal potential. If $V(t)$ exceeds a certain threshold $\theta$, the neuron fires
(emits an action potential, or spike) and $V(t)$ is reset to $V_L$. The synaptic current
$I(t)$ takes into account the fact that other presynaptic neurons fire and excite the
neuron of interest, whose potential is given by $V(t)$.

The (PPS) equation originates from earlier work in which the explicit
solution of a classical IF model such as (1.1) has been discussed. To be more precise, the
membrane voltage of one neuron at time $t$ is described by:


$$V(t) = V_r + (V_L - V_r)\, e^{-(t-T)/\tau_m} + \int_T^t h(t-u)\, N_{input}(du), \tag{1.2}$$

where $V_r$ is the resting potential satisfying $V_L < V_r < \theta$, $T$ is the last spike emitted
by the considered neuron, $\tau_m$ is the time constant of the system (normally $\tau_m = C_m/g_L$),
$h$ is the excitatory post synaptic potential (EPSP) and $N_{input}$ is the sum
of Dirac masses at each spike of the presynaptic neurons. Since after firing, $V(t)$
is reset to $V_L < V_r$, there is a refractory period when the neuron is less excitable
than at rest. The time constant $\tau_m$ indicates whether the next spike can occur
more or less rapidly. The other main quantity, $\int_T^t h(t-u)\, N_{input}(du)$, is the synaptic
integration term.
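
As a rough illustration of (1.2), the sketch below evaluates the membrane voltage between two spikes of the considered neuron. It is only a minimal sketch: the function name `membrane_voltage`, the exponential EPSP and all numerical values are hypothetical choices made for the example, and the integral against $N_{input}$ is taken as a sum over the presynaptic spikes in $(T, t]$, which is an assumption about the endpoints.

```python
import math


def membrane_voltage(t, last_spike, input_spikes, v_rest, v_leak, tau_m, h):
    """Evaluate (1.2) at time t, given the last spike time T = last_spike.

    input_spikes lists the presynaptic spike times (the atoms of N_input);
    h is the EPSP kernel. Only presynaptic spikes in (T, t] are integrated
    (assumed convention for the endpoints).
    """
    # Relaxation from the reset value V_L towards the resting potential V_r.
    relaxation = (v_leak - v_rest) * math.exp(-(t - last_spike) / tau_m)
    # Synaptic integration term: integral of h(t - u) against N_input on (T, t].
    synaptic = sum(h(t - u) for u in input_spikes if last_spike < u <= t)
    return v_rest + relaxation + synaptic


def example_epsp(delay):
    """Hypothetical exponentially decaying EPSP, used only for illustration."""
    return 0.5 * math.exp(-delay / 5.0)


# Illustrative values (mV and ms), not taken from the article.
voltage = membrane_voltage(10.0, 0.0, [4.0, 7.5], -65.0, -70.0, 20.0, example_epsp)
```

Right after a spike ($t = T$, no input) the function returns $V_L$, and without input it relaxes towards $V_r$, which is the refractory behaviour described above.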

A whole random network of such IF neurons has also been considered, looking at the
behavior of this model when the only randomness is in the network. In many other
studies, IF models such as (1.1) are considered to finally obtain other systems
of partial differential equations (different from (PPS)) describing the behavior of neural
networks. In these studies, each presynaptic neuron is assumed to fire as an independent
Poisson process and, via a diffusion approximation, the synaptic current is then
approximated by a continuous-in-time stochastic process of Ornstein-Uhlenbeck type.


1.2. The (PPS) equation 

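The deterministic PDE proposed by Pakdaman, Perthame and Salort, whose origin
is also the microscopic IF model (1.2), is the following:

$$\begin{cases}
\dfrac{\partial n(s,t)}{\partial t} + \dfrac{\partial n(s,t)}{\partial s} + p(s, X(t))\, n(s,t) = 0, \\[2mm]
m(t) := n(0,t) = \displaystyle\int_0^{+\infty} p(s, X(t))\, n(s,t)\, ds.
\end{cases} \tag{PPS}$$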


In this equation, $n(s,t)$ represents a probability density of neurons at time $t$ having
discharged at time $t - s$. Therefore, $s$ represents the time elapsed since the last
discharge. The fact that the equation is an elapsed time structured equation is
natural, because the IF model (1.2) clearly only depends on the time since the last
spike. More informally, the variable $s$ represents the "age" of the neuron.

The first equation of the system (PPS) represents a pure transport process and
means that as time goes by, neurons of age $s$ and past given by $X(t)$ are either
aging linearly or reset to age 0 with rate $p(s, X(t))$.

The second equation of (PPS) describes the fact that when neurons spike, the
age (the elapsed time) returns to 0. Therefore, $n(0,t)$ depicts the density of neurons
undergoing a discharge at time $t$ and it is denoted by $m(t)$. As a consequence of
this boundary condition for $n$ at $s = 0$, the following conservation law is obtained:

$$\int_0^{\infty} n(s,t)\, ds = \int_0^{\infty} n(s,0)\, ds.$$

This means that if $n(\cdot, 0)$ is a probability density, then $n(\cdot, t)$ can be interpreted
as a density at each time $t$. Denoting by $dt$ the Lebesgue measure and since $m(t)$ is
the density of firing neurons at time $t$ in (PPS), $m(t)dt$ can also be interpreted as
the limit of $N_{input}(dt)$ in (1.2) when the population of neurons becomes continuous.

The system (PPS) is nonlinear since the rate $p(s, X(t))$ depends on $n(0,t)$ by
means of the quantity $X(t)$:

$$X(t) = \int_0^t h(u)\, m(t-u)\, du = \int_0^t h(u)\, n(0, t-u)\, du. \tag{1.3}$$

The quantity $X(t)$ represents the interactions between neurons. It "takes into
account the averaged propagation time for the ionic pulse in this network". More
precisely, with respect to the IF models (1.2), this is the synaptic integration term
once the population becomes continuous. The only difference is that in (1.2) the
memory is cancelled once the last spike has occurred, and this is not the case here.
However, informally, both quantities have the same interpretation. Note nevertheless
that, in the (PPS) framework, the function $h$ can be much more general than the $h$ of the IF models,
which clearly corresponds to EPSP. From now on and in the rest of the paper, $h$ is
just a general non-negative function, without forcing the connection with EPSP.
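
To make the transport, reset and interaction mechanisms of (PPS) and (1.3) concrete, here is a minimal numerical sketch. It is not the scheme used by Pakdaman, Perthame and Salort: it assumes a truncated age grid $s_j = j\,dt$ with the same step in age and time (so that aging is an exact shift), a last cell that collects all larger ages, and a plain Riemann sum for $X(t)$. The name `pps_step` and its arguments are hypothetical.

```python
import math


def pps_step(n, m_history, dt, p, h):
    """Advance the age density n (values on the grid s_j = j * dt) by one step dt.

    m_history[-1 - i] approximates the firing density m(t - i * dt); the new
    value m(t + dt) = n(0, t + dt) is appended to it.
    """
    # X(t) = int_0^t h(u) m(t - u) du, approximated by a Riemann sum.
    x = sum(h(i * dt) * m * dt for i, m in enumerate(reversed(m_history)))
    # Along characteristics, a fraction exp(-p dt) of the neurons of age s_j
    # does not fire during the step.
    kept = [n_j * math.exp(-p(j * dt, x) * dt) for j, n_j in enumerate(n)]
    # Neurons that fired are reset to age 0 (boundary condition of (PPS)).
    fired = sum(n_j - k_j for n_j, k_j in zip(n, kept))
    new_n = [fired] + kept[:-1]   # the others age by dt (shift by one cell)
    new_n[-1] += kept[-1]         # the last cell collects ages beyond the grid
    m_history.append(fired)
    return new_n
```

By construction the discrete mass `sum(n) * dt` is preserved at each step, which mirrors the conservation law stated above; with a constant rate $p = \lambda$ and $h = 0$, the exponential profile $\lambda e^{-\lambda s}$ is left (numerically) unchanged.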

The larger $p(s, X(t))$, the more likely neurons of age $s$ and past $X(t)$ fire. Most of
the time (but it is not a requisite), $p$ is assumed to be less than 1 and is interpreted
as the probability that neurons of age $s$ fire. However, as shown later and
as interpreted in many population structured equations, $p(s, X(t))$ is closer
to a hazard rate, i.e. a positive quantity such that $p(s, X(t))\,dt$ is informally the
probability to fire given that the neuron has not fired yet. In particular, it need
not be bounded by 1 and does not need to integrate to 1. A toy example is obtained
if $p(s, X(t)) = \lambda > 0$, where a steady state solution is $n(s,t) = \lambda e^{-\lambda s} \mathbf{1}_{s \geq 0}$: this is
the density of an exponential variable with parameter $\lambda$.
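
The toy example can be checked directly. The short computation below is only a sketch, assuming a time-independent solution $n(s,t) = n(s)$ of (PPS) with constant rate $\lambda$ and total mass 1:

```latex
\begin{align*}
\partial_s n(s) + \lambda\, n(s) = 0 \quad (s > 0)
  &\quad\Longrightarrow\quad n(s) = n(0)\, e^{-\lambda s}, \\
n(0) = \int_0^{+\infty} \lambda\, n(s)\, ds = \lambda \int_0^{+\infty} n(s)\, ds = \lambda
  &\quad\Longrightarrow\quad n(s) = \lambda\, e^{-\lambda s}\, \mathbf{1}_{s \geq 0}.
\end{align*}
```

The first line is the stationary transport equation, the second one is the boundary condition combined with the normalisation.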

However, based on the interpretation of $p(s, X(t))$ as a probability bounded
by 1, one of the main models that Pakdaman, Perthame and Salort consider is
$p(s, X(t)) = \mathbf{1}_{s > \sigma(X(t))}$. This again can be easily interpreted by looking at (1.2).
Indeed, since in the IF models the spike happens when the threshold $\theta$ is reached,
one can consider that $p(s, X(t))$ should be equal to 1 whenever

$$V(t) = V_r + (V_L - V_r)\, e^{-(t-T)/\tau_m} + X(t) > \theta,$$

and 0 otherwise. Since $V_L - V_r < 0$, $p(s, X(t)) = 1$ is indeed equivalent to $s = t - T$
being larger than some decreasing function of $X(t)$. This has the double advantage of giving
a formula for the refractory period ($\sigma(X(t))$) and of modelling excitatory systems: the
refractory period decreases when the whole firing rate increases via $X(t)$, and this
makes the neurons fire even more. It is for this particular case that Pakdaman,
Perthame and Salort have shown the existence of oscillatory behavior.

Another important parameter in the (PPS) model is $J$,
which can be seen with our formalism as $\int h$ and which describes the network
connectivity or the strength of the interaction. It has been proved that, for
highly or weakly connected networks, the (PPS) model exhibits relaxation to a steady
state, and periodic solutions have also been numerically observed for moderately
connected networks. The regime where relaxation
to a stationary solution occurs has been quantified in terms of $J$, and periodic solutions
have been described for intermediate values of $J$.

Recently, the (PPS) model has been extended by including a fragmentation
term, which describes the adaptation and fatigue of the neurons. In this sense, this
new term incorporates the past activity of the neurons. For this new model, in
the linear case there is exponential convergence to the steady states, while in the
weakly nonlinear case a total desynchronization in the network is proved. Moreover,
for greater nonlinearities, synchronization can again be numerically observed.

2. Point processes and conditional intensities as models for spike 
trains 

We start by quickly reviewing the main basic concepts and notations of point
processes, in particular conditional intensities and Ogata's thinning. We refer
the interested reader to the reference literature for an exhaustive treatment and for
more condensed presentations of the main useful notions.

2.1. Counting processes and conditional intensities 

We focus on locally finite point processes on $\mathbb{R}$, equipped with the Borel sets $\mathcal{B}(\mathbb{R})$.

Definition 2.1 (Locally finite point process). A locally finite point process $N$
on $\mathbb{R}$ is a random set of points such that it has almost surely (a.s.) a finite number
of points in finite intervals. Therefore, associated to $N$ there is an ordered sequence
of extended real valued random times $\cdots \leq T_{-1} \leq T_0 \leq 0 < T_1 \leq \cdots$.

For a measurable set $A$, $N_A$ denotes the number of points of $N$ in $A$. This is a
random variable with values in $\mathbb{N} \cup \{\infty\}$.

Definition 2.2 (Counting process associated to a point process). The process
on $\mathbb{R}_+$ defined by $t \mapsto N_t := N_{(0,t]}$ is called the counting process associated to
the point process $N$.
The natural and the predictable filtrations are fundamental for the present work.

Definition 2.3 (Natural filtration of a point process). The natural filtration
of $N$ is the family $(\mathcal{F}^N_t)_{t \in \mathbb{R}}$ of $\sigma$-algebras defined by $\mathcal{F}^N_t = \sigma(N \cap (-\infty, t])$.

Definition 2.4 (Predictable filtration of a point process). The predictable
filtration of $N$ is the family of $\sigma$-algebras $(\mathcal{F}^N_{t-})_{t \in \mathbb{R}}$ defined by
$\mathcal{F}^N_{t-} = \sigma(N \cap (-\infty, t))$.

The intuition behind this concept is that $\mathcal{F}^N_t$ contains all the information given
by the point process at time $t$. In particular, it contains the information whether
$t$ is a point of the process or not, while $\mathcal{F}^N_{t-}$ only contains the information given
by the point process strictly before $t$. Therefore, it does not contain (in general)
the information whether $t$ is a point or not. In this sense, $\mathcal{F}^N_{t-}$ represents (the
information contained in) the past.

Under some rather classical conditions, which are always assumed to be satisfied
here, one can associate to $(N_t)_{t \geq 0}$ a stochastic intensity $\lambda(t, \mathcal{F}^N_{t-})$, which is
a non-negative random quantity. The notation $\lambda(t, \mathcal{F}^N_{t-})$ for the intensity refers
to the predictable version of the intensity associated to the natural filtration, and
$(N_t - \int_0^t \lambda(u, \mathcal{F}^N_{u-})\,du)_{t \geq 0}$ forms a local martingale. Informally, $\lambda(t, \mathcal{F}^N_{t-})\,dt$
represents the probability to have a new point in the interval $[t, t + dt)$ given the past. Note
that $\lambda(t, \mathcal{F}^N_{t-})$ should not be understood as a function, in the same way as a density
is for random variables. It is a "recipe" explaining how the probability to find a
new point at time $t$ depends on the past configuration: since the past configuration
depends on its own past, this is closer to a recursive formula. In this respect, the
intensity should obviously depend on $N \cap (-\infty, t)$ and not on $N \cap (-\infty, t]$ to predict
the occurrence at time $t$, since we cannot know whether $t$ is already a point or not.

The distribution of the point process $N$ on $\mathbb{R}$ is completely characterized by the
knowledge of the intensity $\lambda(t, \mathcal{F}^N_{t-})$ on $\mathbb{R}_+$ and the distribution of $N_- = N \cap \mathbb{R}_-$,
which is denoted by $\mathbb{P}_0$ in the sequel. The information about $\mathbb{P}_0$ is necessary since
each point of $N$ may depend on the occurrence of all the previous points: if for all
$t > 0$, one knows the "recipe" $\lambda(t, \mathcal{F}^N_{t-})$ that gives the probability of a new point at
time $t$ given the past configuration, one still needs to know the distribution of $N_-$
to obtain the whole process.

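Two main assumptions are used depending on the type of results we seek:

$(\mathcal{A}^{L^1,a.s.}_{\lambda,loc})$: for any $T > 0$, $\int_0^T \lambda(t, \mathcal{F}^N_{t-})\,dt$ is finite a.s.

$(\mathcal{A}^{L^1,exp}_{\lambda,loc})$: for any $T > 0$, $\mathbb{E}\left[\int_0^T \lambda(t, \mathcal{F}^N_{t-})\,dt\right]$ is finite.

Clearly $(\mathcal{A}^{L^1,exp}_{\lambda,loc})$ implies $(\mathcal{A}^{L^1,a.s.}_{\lambda,loc})$. Note that $(\mathcal{A}^{L^1,a.s.}_{\lambda,loc})$ implies non-explosion in
finite time for the counting process $(N_t)$.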


Definition 2.5 (Point measure associated to a point process). The point
measure associated to $N$ is denoted by $N(dt)$ and defined by $N(dt) = \sum_{i \in \mathbb{Z}} \delta_{T_i}(dt)$,
where $\delta_u$ is the Dirac mass in $u$.


By analogy with (PPS), and since points of point processes correspond to spikes
(or times of discharge) for the considered neuron in spike train analysis, $N(dt)$ is the
microscopic equivalent of the distribution of discharging neurons $m(t)dt$. Following
this analogy, and since $T_{N_t}$ is the last point less than or equal to $t$ for every $t \geq 0$, the
age $S_t$ at time $t$ is defined by $S_t = t - T_{N_t}$. In particular, if $t$ is a point of $N$, then
$S_t = 0$. Note that $S_t$ is $\mathcal{F}^N_t$ measurable for every $t \geq 0$ and therefore, $S_0 = -T_0$ is
$\mathcal{F}^N_0$ measurable. To define an age at time $t = 0$, one assumes that

$(\mathcal{A}_{T_0})$: There exists a first point before 0 for the process $N_-$, i.e. $-\infty < T_0$.

As we have remarked before, the conditional intensity should depend on $N \cap (-\infty, t)$.
Therefore, it cannot be a function of $S_t$, since $S_t$ informs us whether $t$ is a point or not.
That is the main reason for considering the $\mathcal{F}^N_{t-}$ measurable variable

$$S_{t-} = t - T_{N_{t-}}, \tag{2.1}$$

where $T_{N_{t-}}$ is the last point strictly before $t$ (see Figure 1). Note also that knowing
$(S_{t-})_{t \geq 0}$ or $(N_t)_{t \geq 0}$ is completely equivalent given $\mathcal{F}^N_0$.

The last and most crucial equivalence between (PPS) and the present point
process set-up consists in noting that the quantities $p(s, X(t))$ and $\lambda(t, \mathcal{F}^N_{t-})$ have
informally the same meaning: they both represent a firing rate, i.e. both give the
rate of discharge as a function of the past. This dependence is made more explicit
in $p(s, X(t))$ than in $\lambda(t, \mathcal{F}^N_{t-})$.

2.2. Examples 

Let us review the basic point process models of spike trains and see what kind of
analogy is likely to exist between both models ((PPS) equation and point processes).
These informal analogies are possibly exact mathematical results (as discussed later).

Homogeneous Poisson process. This is the simplest case, where $\lambda(t, \mathcal{F}^N_{t-}) = \lambda$,
with $\lambda$ a fixed positive constant representing the firing rate. There is no dependence
on time $t$ (it is homogeneous) and no dependence with respect to the past. This
case should be equivalent to $p(s, X(t)) = \lambda$ in (PPS). This can be made even more
explicit. Indeed, in the case where the Poisson process exists on the whole real line
(stationary case), it is easy to see that

$$\mathbb{P}(S_{t-} > s) = \mathbb{P}(N_{[t-s,t)} = 0) = \exp(-\lambda s),$$

meaning that the age $S_{t-}$ obeys an exponential distribution with parameter $\lambda$, i.e.
the steady state of the toy example developed for (PPS) when $p(s, X(t)) = \lambda$.

Inhomogeneous Poisson process. To model non-stationarity, one can use
$\lambda(t, \mathcal{F}^N_{t-}) = \lambda(t)$, which only depends on time. This case should be equivalent to the
replacement of $p(s, X(t))$ in (PPS) by $\lambda(t)$.

Renewal process. This model is very useful to take refractory periods into account.
It corresponds to the case where the ISIs (delays between spikes) are independent
and identically distributed (i.i.d.) with a certain given density $\nu$ on $\mathbb{R}_+$. The associated
hazard rate is

$$f(s) = \frac{\nu(s)}{\int_s^{+\infty} \nu(x)\, dx},$$

when $\int_s^{+\infty} \nu(x)\, dx > 0$. Roughly speaking, $f(s)\,ds$ is the probability that a neuron
spikes with age $s$ given that its age is larger than $s$. In this case, considering the set
of spikes as the point process $N$, it is easy to show (see Appendix B.1) that its
corresponding intensity is $\lambda(t, \mathcal{F}^N_{t-}) = f(S_{t-})$, which only depends on the age. One
can also show quite easily that the process $(S_{t-})_{t \geq 0}$, which is equal to $(S_t)_{t \geq 0}$ almost
everywhere (a.e.), is a Markovian process in time. This renewal setting should be
equivalent in the (PPS) framework to $p(s, X(t)) = f(s)$.
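
The hazard rate of a given ISI density can be evaluated numerically. The following is a minimal sketch, assuming the tail integral is truncated at a finite bound `upper` and approximated by the trapezoidal rule; the function name `hazard_rate` and its parameters are hypothetical.

```python
import math


def hazard_rate(density, s, upper=50.0, steps=20000):
    """Approximate f(s) = nu(s) / int_s^infty nu(x) dx.

    The tail integral is truncated at `upper` (an assumption: the density is
    taken to be negligible beyond it) and computed by the trapezoidal rule.
    """
    width = (upper - s) / steps
    tail = sum(
        0.5 * (density(s + i * width) + density(s + (i + 1) * width)) * width
        for i in range(steps)
    )
    if tail <= 0.0:
        raise ValueError("hazard rate undefined: no mass beyond s")
    return density(s) / tail


def exponential_density(x, rate=2.0):
    """ISI density of a homogeneous Poisson process with the given rate."""
    return rate * math.exp(-rate * x)


# For exponential ISIs the hazard rate should be close to the constant rate.
approx_rate = hazard_rate(exponential_density, 0.5)
```

For exponential ISIs the hazard rate is constant, which recovers the homogeneous Poisson case $p(s, X(t)) = \lambda$; a density putting little weight near 0 gives a small hazard rate at small ages, i.e. a refractory period.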

Note that many people consider IF models with Poissonian inputs, with or
without additive white noise. In both cases, the system erases all memory after each
spike and therefore the ISIs are i.i.d. Therefore, as long as we are only interested in
the spike trains and their point process models, those IF models are just a particular
case of renewal process [8, 10, 17, 35].


Wold process and more general structures. Let $A^1_t$ be the delay (ISI) between
the last point and the occurrence just before (see also Figure 1), $A^1_t = T_{N_{t-}} - T_{N_{t-}-1}$.
A Wold process is then characterized by $\lambda(t, \mathcal{F}^N_{t-}) = f(S_{t-}, A^1_t)$. This model
has been matched to several real data sets thanks to goodness-of-fit tests and is
therefore one of our main examples, together with the Hawkes process discussed next.
One can show in this case that the successive ISIs form a Markov chain of order 1
and that the continuous time process $(S_{t-}, A^1_t)$ is also Markovian.

This case should be equivalent to the replacement of $p(s, X(t))$ in (PPS) by
$f(s, a_1)$, with $a_1$ denoting the delay between the two previous spikes. Naturally in
this case, one should expect a PDE of higher dimension with third variable $a_1$.
More generally, one could define

$$A^k_t = T_{N_{t-}-(k-1)} - T_{N_{t-}-k}, \tag{2.2}$$

and point processes with intensity $\lambda(t, \mathcal{F}^N_{t-}) = f(S_{t-}, A^1_t, \ldots, A^k_t)$. Those processes
satisfy more generally that their ISIs form a Markov chain of order $k$ and that the
continuous time process $(S_{t-}, A^1_t, \ldots, A^k_t)$ is also Markovian (see Appendix B.2).

Remark 2.1. The dynamics of the successive ages is pretty simple. On the one
hand, the dynamics of the vector of the successive ages $(S_{t-}, A^1_t, \ldots, A^k_t)_{t \geq 0}$ is
deterministic between two jumping times. The first coordinate increases with rate 1.
On the other hand, the dynamics at any jumping time $T$ is given by the following
shift:

$$\begin{cases}
\text{the age process goes to 0, i.e. } S_T = 0, \\
\text{the first delay becomes the age, i.e. } A^1_{T+} = S_{T-}, \\
\text{the other delays are shifted, i.e. } A^{i+1}_{T+} = A^i_T \text{ for all } i < k.
\end{cases} \tag{2.3}$$
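
The two pieces of dynamics described in Remark 2.1 can be written as two small state-update functions. This is only an illustrative sketch: the state is represented as a pair `(age, delays)` with `delays = (A^1, ..., A^k)`, and the names `advance` and `jump` are hypothetical.

```python
def advance(state, dt):
    """Between two jumps: the age grows at rate 1, the delays are frozen."""
    age, delays = state
    return (age + dt, delays)


def jump(state):
    """At a jumping time T, apply the shift (2.3).

    The age is reset to 0, the first delay becomes the age just before the
    jump, and the other delays are shifted by one position (the oldest one
    is dropped so that the vector keeps k components).
    """
    age, delays = state
    shifted = ((age,) + delays)[:len(delays)]
    return (0.0, shifted)


# Example with k = 2: age 1.5 just before the spike, previous ISIs 2.0 and 3.0.
state_after_spike = jump((1.5, (2.0, 3.0)))
state_later = advance(state_after_spike, 0.5)
```

With $k = 0$ the delay vector stays empty and the update reduces to the age process of a renewal process.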
Hawkes processes. The most classical setting is the linear (univariate) Hawkes
process, which corresponds to

$$\lambda(t, \mathcal{F}^N_{t-}) = \mu + \int_{-\infty}^{t-} h(t-u)\, N(du),$$

where the positive parameter $\mu$ is called the spontaneous rate and the non-negative
function $h$, with support in $\mathbb{R}_+$, is called the interaction function, which is generally
assumed to satisfy $\int_{\mathbb{R}_+} h < 1$ to guarantee the existence of a stationary version.
This model has also been matched to several real neuronal data sets thanks to
goodness-of-fit tests. Since it can mimic synaptic integration, as explained below, this
represents the main example of the present work.

In the case where $T_0$ tends to $-\infty$, which is equivalent to saying that there is no point
on the negative half-line, one can rewrite

$$\lambda(t, \mathcal{F}^N_{t-}) = \mu + \int_0^{t-} h(t-u)\, N(du).$$

By analogy between $N(dt)$ and $m(t)dt$, one sees that $\int_0^{t-} h(t-u)\, N(du)$ is indeed
the analogue of $X(t)$, the synaptic integration in (1.3). So one could expect that
the PDE analogue is given by $p(s, X(t)) = \mu + X(t)$. We show later that
this does not hold stricto sensu, whereas the other analogues work well.

Note that this model also shares some links with IF models. Indeed, the formula
for the intensity is close to the formula for the voltage (1.2), with the same flavor for
the synaptic integration term. The main difference comes from the fact that when
the voltage reaches a certain threshold, it fires deterministically for the IF model,
whereas the higher the intensity, the more likely the spike is for the Hawkes model,
but without certainty. In this sense Hawkes models seem closer to (PPS) since, as we
discussed before, the term $p(s, X(t))$ is closer to a hazard rate and never imposes
deterministically the presence of a spike.

To model inhibition, one can use functions $h$ that may take
negative values, and in this case $\lambda(t, \mathcal{F}^N_{t-}) = \left(\mu + \int_{-\infty}^{t-} h(t-u)\, N(du)\right)_+$, which
should correspond to $p(s, X(t)) = (\mu + X(t))_+$. Another possibility is
$\lambda(t, \mathcal{F}^N_{t-}) = \exp\left(\mu + \int_{-\infty}^{t-} h(t-u)\, N(du)\right)$, which is inspired by the generalized
linear model and which should correspond to $p(s, X(t)) = \exp(\mu + X(t))$.
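
The three intensities above (linear, positive part, exponential) only differ by the function applied to $\mu + \int h(t-u)\,N(du)$. The sketch below evaluates them from a finite list of past spikes; the function name `hawkes_intensity` and the `link` argument are hypothetical, and only spikes strictly before $t$ are used, in line with the predictability requirement.

```python
import math


def hawkes_intensity(t, spikes, mu, h, link="linear"):
    """Evaluate the conditional intensity at time t given the past spikes.

    Only the points strictly before t enter the sum, so the result depends
    on N restricted to (-infinity, t), not on whether t itself is a point.
    """
    drive = mu + sum(h(t - u) for u in spikes if u < t)
    if link == "linear":         # classical linear Hawkes process (h >= 0)
        return drive
    if link == "positive_part":  # allows inhibition through negative h
        return max(drive, 0.0)
    if link == "exp":            # exponential variant inspired by GLMs
        return math.exp(drive)
    raise ValueError("unknown link: " + link)


def exp_kernel(u):
    """Interaction function h(u) = exp(-u), as in the example of Figure 1."""
    return math.exp(-u)


# Intensity at t = 1 with one earlier spike at 0; the spike at t = 1 is ignored.
value = hawkes_intensity(1.0, [0.0, 1.0], 0.5, exp_kernel)
```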

Note finally that Hawkes models in neuroscience (and their variants) are usually
multivariate, meaning that they model interaction between spike trains thanks to
interaction functions between point processes, each process representing a neuron.
To keep the present analogy as simple as possible, we do not deal with those
multivariate models in the present article. Some open questions in this direction are
presented in the conclusion.
2.3. Ogata's thinning algorithm

To turn the analogy between $p(s, X(t))$ and $\lambda(t, \mathcal{F}^N_{t-})$ into a rigorous result on the
PDE level, we need to understand the intrinsic dynamics of the point process. This
dynamics is often not explicitly described in the literature (see e.g. the reference
book by Brémaud) because martingale theory provides a nice mathematical setting
in which one can perform all the computations. However, when one wants to
simulate point processes based on the knowledge of their intensity, there is indeed
a dynamics that is required to obtain a practical algorithm. This method was
first described by Lewis in the Poisson setting and generalized by Ogata.
Although a sketch of proof exists in the literature, we have been unable to find any complete
mathematical proof of this construction, and we propose a full and
mathematically complete version of this proof with minimal assumptions in
Appendix B.4. Let us just informally describe here how this construction works.

The principle consists in assuming that we are given an external homogeneous Poisson
process $\Pi$ of intensity 1 in $\mathbb{R}_+^2$, with associated point measure
$\Pi(dt, dx) = \sum_{(T,X) \in \Pi} \delta_{(T,X)}(dt, dx)$. This means in particular that

$$\mathbb{E}[\Pi(dt, dx)] = dt\, dx. \tag{2.4}$$

Once a realisation of $N_-$ is fixed, which implies that $\mathcal{F}^N_0$ is known and which can be
seen as an initial condition for the dynamics, the construction of the process $N$ on
$\mathbb{R}_+$ only depends on $\Pi$.

More precisely, if we know the intensity $\lambda(t, \mathcal{F}^N_{t-})$ in the sense of the "recipe"
that explicitly depends on $t$ and $N \cap (-\infty, t)$, then once a realisation of $\Pi$ and of $N_-$
is fixed, the dynamics to build a point process $N$ with intensity $\lambda(t, \mathcal{F}^N_{t-})$ for $t \in \mathbb{R}_+$
is purely deterministic. It consists (see also Figure 1) in successively projecting on
the abscissa axis the points that are below the graph of $\lambda(t, \mathcal{F}^N_{t-})$. Note that a point
projection may change the shape of $\lambda(t, \mathcal{F}^N_{t-})$ just after the projection. Therefore the
graph of $\lambda(t, \mathcal{F}^N_{t-})$ evolves thanks to the realization of $\Pi$. For a more mathematical
description, see Theorem B.11 in Appendix B.4. Note in particular that the
construction ends on any finite interval $[0, T]$ a.s. if the a.s. local integrability
assumption $(\mathcal{A}^{L^1,a.s.}_{\lambda,loc})$ holds.

Then the point process $N$, result of Ogata's thinning, is given by the union of
$N_-$ on $\mathbb{R}_-$ and the projected points on $\mathbb{R}_+$. It admits the desired intensity $\lambda(t, \mathcal{F}^N_{t-})$
on $\mathbb{R}_+$. Moreover, the point measure can be represented by

$$\mathbf{1}_{t>0}\, N(dt) = \sum_{(T,X) \in \Pi,\; X \leq \lambda(T, \mathcal{F}^N_{T-})} \delta_T(dt) = \int_{x=0}^{\lambda(t, \mathcal{F}^N_{t-})} \Pi(dt, dx). \tag{2.5}$$

NB: The last equality comes from the following convention. If $\delta_{(c,d)}$ is a Dirac mass
in $(c, d) \in \mathbb{R}_+^2$, then $\int_{x=a}^{b} \delta_{(c,d)}(dt, dx)$, as a distribution in $t$, is $\delta_c(dt)$ if $d \in [a, b]$
and 0 otherwise.
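
The construction can be sketched in a few lines of code. This is a simplified illustration, not the complete construction of the appendix: it assumes no point before 0 and an intensity that stays below a known bound `x_max` on $[0, \text{horizon}]$, so that only the points of $\Pi$ in the strip $[0, \text{horizon}] \times [0, x_{max}]$ matter. Those points are generated as a homogeneous Poisson process of rate `x_max` in time with independent uniform marks. The names `ogata_thinning`, `x_max` and `exp_hawkes_intensity` are hypothetical.

```python
import math
import random


def ogata_thinning(intensity, horizon, x_max, rng):
    """Build the points of N on (0, horizon] by projecting the points of Pi
    that lie below the graph of the intensity.

    intensity(t, past) must return the conditional intensity at t given the
    list `past` of already accepted points (all strictly before t).
    """
    accepted = []
    t = 0.0
    while True:
        # Next point (t, x) of Pi in the strip [0, horizon] x [0, x_max].
        t += rng.expovariate(x_max)
        if t > horizon:
            return accepted
        x = rng.uniform(0.0, x_max)
        lam = intensity(t, accepted)
        if lam > x_max:
            raise ValueError("intensity exceeded x_max: the strip is too narrow")
        if x <= lam:
            # The point is below the graph: project it on the time axis.
            accepted.append(t)


def exp_hawkes_intensity(t, past, mu=0.5, weight=0.5):
    """Linear Hawkes intensity with h(u) = weight * exp(-u) (illustrative values)."""
    return mu + sum(weight * math.exp(-(t - u)) for u in past if u < t)


spike_times = ogata_thinning(exp_hawkes_intensity, 20.0, 50.0, random.Random(0))
```

Each accepted point immediately modifies the intensity seen by the later candidates, which is the "recipe" aspect of the conditional intensity; the bound `x_max` is a practical device of this sketch and is not part of the general construction.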
Fig. 1. Example of Ogata's thinning algorithm on a linear Hawkes process with interaction function
$h(u) = e^{-u}$ and no point before 0 (i.e. $N_- = \emptyset$). The crosses represent a realization of $\Pi$, Poisson
process of intensity 1 on $\mathbb{R}_+^2$. The blue piecewise continuous line represents the intensity $\lambda(t, \mathcal{F}^N_{t-})$,
which starts in 0 with value $\mu$ and then jumps each time a point of $\Pi$ is present underneath it.
The resulting Hawkes process (with intensity $\lambda(t, \mathcal{F}^N_{t-})$) is given by the blue circles. Age $S_{t-}$ at
time $t$ and the quantity $A^1_t$ are also represented.

3. From point processes to PDE

Let us now present our main results. Informally, we want to describe the evolution
of the distribution in $s$ of the age $S_t$ according to the time $t$. Note that at fixed time
$t$, $S_{t-} = S_t$ a.s. and therefore it is the same as the distribution of $S_{t-}$. We prefer
to study $S_{t-}$ since its predictability, i.e. its dependence on $N \cap (-\infty, t)$, makes all
definitions proper from a microscopic/random point of view. Microscopically, the
interest lies in the evolution of $\delta_{S_{t-}}(ds)$ as a random measure. But it should also be
seen as a distribution in time, for equations like (PPS) to make sense. Therefore,
we need to go from a distribution only in $s$ to a distribution in both $s$ and $t$. Then
one can either focus on the microscopic level, where the realisation of $\Pi$ in Ogata's
thinning construction is fixed, or focus on the expectation of such a distribution.

3.1. A clean setting for bivariate distributions in age and time 

In order to obtain the (PPS) system from a point process, we need to define bivariate
distributions in $s$ and $t$ and marginals (at least in $s$), in such a way that weak solutions
of (PPS) are correctly defined. Since we want to possibly consider more than
two variables for generalized Wold processes, we consider the following definitions.

In the following, $\langle \varphi, \nu \rangle$ denotes the integral of the integrable function $\varphi$ with
respect to the measure $\nu$.

Let $k \in \mathbb{N}$. For every bounded measurable function $\varphi$ of $(t,s,a_1,\dots,a_k) \in \mathbb{R}_+^{k+2}$,
one can define

$$\varphi^{(1)}_t(s,a_1,\dots,a_k) = \varphi(t,s,a_1,\dots,a_k) \quad\text{and}\quad \varphi^{(2)}_s(t,a_1,\dots,a_k) = \varphi(t,s,a_1,\dots,a_k).$$

Let us now define two sets of regularities for $\varphi$.

The function $\varphi$ belongs to $\mathcal{M}_{c,b}(\mathbb{R}_+^{k+2})$ if and only if

• $\varphi$ is a measurable bounded function,

• there exists $T > 0$ such that for all $t > T$, $\varphi^{(1)}_t = 0$.

The function $\varphi$ belongs to $C^\infty_{c,b}(\mathbb{R}_+^{k+2})$ if and only if

• $\varphi$ is continuous, uniformly bounded,

• $\varphi$ has uniformly bounded derivatives of every order,

• there exists $T > 0$ such that for all $t > T$, $\varphi^{(1)}_t = 0$.

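Let $(\nu^1_t)_{t\ge 0}$ be a (measurable w.r.t. $t$) family of positive measures on $\mathbb{R}_+^{k+1}$ and
$(\nu^2_s)_{s\ge 0}$ be a (measurable w.r.t. $s$) family of positive measures on $\mathbb{R}_+^{k+1}$. Those families
satisfy the Fubini property if

$(\mathcal{P}_{Fubini})$: for any $\varphi \in \mathcal{M}_{c,b}(\mathbb{R}_+^{k+2})$, $\int \langle \varphi^{(1)}_t, \nu^1_t \rangle\, dt = \int \langle \varphi^{(2)}_s, \nu^2_s \rangle\, ds$.

In this case, one can define $\nu$, measure on $\mathbb{R}_+^{k+2}$, as the unique measure on $\mathbb{R}_+^{k+2}$
such that for any test function $\varphi$ in $\mathcal{M}_{c,b}(\mathbb{R}_+^{k+2})$,

$$\langle \varphi, \nu \rangle = \int \langle \varphi^{(1)}_t, \nu^1_t \rangle\, dt = \int \langle \varphi^{(2)}_s, \nu^2_s \rangle\, ds.$$

To simplify notations, for any such measure $\nu(dt,ds,da_1,\dots,da_k)$, we define

$$\nu(t,ds,da_1,\dots,da_k) = \nu^1_t(ds,da_1,\dots,da_k), \qquad \nu(dt,s,da_1,\dots,da_k) = \nu^2_s(dt,da_1,\dots,da_k).$$

In the sequel, we need in particular a measure on $\mathbb{R}_+^2$, $\eta_x$, defined for any real $x$
by its marginals that satisfy $(\mathcal{P}_{Fubini})$ as follows

$$\forall\, t,s \ge 0, \quad \eta_x(t,ds) = \delta_{t-x}(ds)\,\mathbf{1}_{t-x>0} \quad\text{and}\quad \eta_x(dt,s) = \delta_{s+x}(dt)\,\mathbf{1}_{s\ge 0}. \tag{3.1}$$

It represents a Dirac mass "travelling" on the positive diagonal originated in $(x,0)$.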
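
As a sanity check, one can verify directly that the two marginals in (3.1) are compatible in the sense of the Fubini property. The computation below is only a sketch, written for a bounded measurable test function $\varphi$ with compact support in time; the indicator $\mathbf{1}_{s+x\ge 0}$ is an assumption that reflects that measures in $t$ live on $\mathbb{R}_+$.

```latex
\begin{align*}
\int_{t \ge 0} \langle \varphi^{(1)}_t, \eta_x(t,\cdot) \rangle \, dt
  &= \int_{t \ge 0} \varphi(t, t-x)\, \mathbf{1}_{t-x>0}\, dt
   = \int_{t > \max(x,0)} \varphi(t, t-x)\, dt, \\
\int_{s \ge 0} \langle \varphi^{(2)}_s, \eta_x(\cdot,s) \rangle \, ds
  &= \int_{s \ge 0} \varphi(s+x, s)\, \mathbf{1}_{s+x \ge 0}\, ds
   = \int_{t \ge \max(x,0)} \varphi(t, t-x)\, dt
   \qquad (t = s+x).
\end{align*}
```

Both integrals run over the half-line $\{(t, t-x)\}$ starting at $(\max(x,0), \max(-x,0))$ and differ at most at one endpoint, which has zero Lebesgue measure, so they coincide.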




3.2. The microscopic construction of a random PDE 


For a fixed realization of $\Pi$, we therefore want to define a random distribution
$U(dt,ds)$ in terms of its marginals, thanks to $(\mathcal{P}_{Fubini})$, such that $U(t,ds)$ represents
the distribution at time $t > 0$ of the age $S_{t-}$, i.e.

$$\forall\, t > 0, \quad U(t,ds) = \delta_{S_{t-}}(ds) \tag{3.2}$$

and satisfies similar equations as (PPS). This is done in the following proposition.


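Proposition 3.1. Let $\Pi$, $\mathcal{F}^N_0$ and an intensity $(\lambda(t,\mathcal{F}^N_{t-}))_{t>0}$ be given as in Section 2.3
and satisfying $(\mathcal{A}_{T_0})$ and $(\mathcal{A}^{L^1,a.s.}_{\lambda,loc})$. On the event $\Omega$ of probability 1, where
Ogata's thinning is well defined, let $N$ be the point process on $\mathbb{R}$ that is constructed
thanks to Ogata's thinning with associated predictable age process $(S_{t-})_{t>0}$ and
whose points are denoted $(T_i)_{i\in\mathbb{Z}}$. Let the (random) measure $U$ and its corresponding
marginals be defined by

$$U(dt,ds) = \sum_{i=0}^{+\infty} \eta_{T_i}(dt,ds)\, \mathbf{1}_{0\le t\le T_{i+1}}. \tag{3.3}$$

Then, on $\Omega$, $U$ satisfies $(\mathcal{P}_{Fubini})$ and $U(t,ds) = \delta_{S_{t-}}(ds)$. Moreover, on $\Omega$, $U$ is a
solution in the weak sense of the following system

$$\frac{\partial}{\partial t} U(dt,ds) + \frac{\partial}{\partial s} U(dt,ds) + \left(\int_{x=0}^{\lambda(t,\mathcal{F}^N_{t-})} \Pi(dt,dx)\right) U(t,ds) = 0, \tag{3.4}$$

$$U(dt,0) = \int_{s\in\mathbb{R}_+} \left(\int_{x=0}^{\lambda(t,\mathcal{F}^N_{t-})} \Pi(dt,dx)\right) U(t,ds) + \delta_0(dt)\,\mathbf{1}_{T_0=0}, \tag{3.5}$$

$$U(0,ds) = \delta_{-T_0}(ds)\,\mathbf{1}_{T_0<0} = U^{in}(ds)\,\mathbf{1}_{s>0}, \tag{3.6}$$

where $U^{in}(ds) = \delta_{-T_0}(ds)$. The weak sense means that for any $\varphi \in C^\infty_{c,b}(\mathbb{R}_+^2)$,

$$\int_{\mathbb{R}_+\times\mathbb{R}_+} \left(\frac{\partial}{\partial t}\varphi(t,s) + \frac{\partial}{\partial s}\varphi(t,s)\right) U(dt,ds) + \int_{\mathbb{R}_+\times\mathbb{R}_+} [\varphi(t,0)-\varphi(t,s)] \left(\int_{x=0}^{\lambda(t,\mathcal{F}^N_{t-})} \Pi(dt,dx)\right) U(t,ds) + \varphi(0,-T_0) = 0. \tag{3.7}$$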


The proof of Proposition 3.1 is included in Appendix A.1. Note also that thanks to
the Fubini property, the boundary condition (3.5) is satisfied also in a strong sense.

System (3.4)-(3.6) is a random microscopic version of (PPS) if $T_0 < 0$, where
$n(s,t)$, the density of the age at time $t$, is replaced by $U(t,\cdot) = \delta_{S_{t-}}$, the Dirac mass
in the age at time $t$. The assumption $T_0 < 0$ is satisfied a.s. if $T_0$ has a density, but
this may not be the case for instance if the experimental device gives an impulse
at time zero (e.g. in studies of peristimulus time histograms (PSTH), where the spike
trains are locked on a stimulus given at time 0).

This result may seem rather poor from a PDE point of view. However, since this
equation is satisfied at a microscopic level, we are able to define correctly all the
important quantities at a macroscopic level. Indeed, the analogy between $p(s,X(t))$
and $\lambda(t,\mathcal{F}^N_{t-})$ is actually, on the random microscopic scale, a replacement of $p(s,X(t))$
by $\int_{x=0}^{\lambda(t,\mathcal{F}^N_{t-})} \Pi(dt,dx)$, whose expectancy given the past is, heuristically speaking,
equal to $\lambda(t,\mathcal{F}^N_{t-})$ because the mean behaviour of $\Pi$ is given by the Lebesgue
measure. Thus, the main question at this stage is: can we make this
argument valid by taking the expectation of $U$? This is addressed in the next section.

The property $(\mathcal{P}_{Fubini})$ and the quantities $\eta_{T_i}$ mainly allow to define $U(dt,0)$
as well as $U(t,ds)$. As expected, with this definition, (3.2) holds as well as

$$U(dt,0) = \mathbf{1}_{t\ge 0}\, N(dt), \tag{3.8}$$

i.e. the spiking measure (the measure in time with age 0) is the point measure.
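
To make the two marginals concrete, the following Python sketch evaluates the predictable age $S_{t-}$ (the atom of $U(t,ds)$) and the mass that the spiking measure $U(dt,0)$ gives to an interval, for a finite list of points. The helper names `predictable_age` and `spiking_mass` are hypothetical and only illustrate (3.2) and (3.8) on a sorted list of spike times.

```python
import bisect

def predictable_age(t, points):
    """S_{t-}: delay between t and the last point strictly before t.
    `points` must be sorted and contain at least one point < t."""
    i = bisect.bisect_left(points, t)  # number of points strictly before t
    if i == 0:
        raise ValueError("no point strictly before t: the age is not defined")
    return t - points[i - 1]

def spiking_mass(a, b, points):
    """Mass given by U(dt, 0) = 1_{t >= 0} N(dt) to the interval [a, b]."""
    lo = bisect.bisect_left(points, max(a, 0.0))
    hi = bisect.bisect_right(points, b)
    return max(hi - lo, 0)
```

Because the age is predictable, a spike located exactly at time $t$ does not reset $S_{t-}$: with points `[-1.5, 0.75, 2.0]`, the age at time `0.75` is still measured from the point `-1.5`.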

Note also that the initial condition is given by $\mathcal{F}^N_0$, since $\mathcal{F}^N_0$ fixes in particular
the value of $T_0$ and $(\mathcal{A}_{T_0})$ is required to give sense to the age at time 0. To understand
the initial condition, remark that if $T_0 = 0$, then $U(0,\cdot) = 0 \ne \lim_{t\to 0^+} U(t,\cdot) = \delta_0$
by definition of $\eta_{T_i}$, but that if $T_0 < 0$, $U(0,\cdot) = \lim_{t\to 0^+} U(t,\cdot) = \delta_{-T_0}$.

The conservativeness (i.e. for all $t > 0$, $\int_0^{\infty} U(t,ds) = 1$) is obtained by using (a
sequence of test functions converging to) $\varphi = \mathbf{1}_{t\le T}$.


Proposition 3.1 shows that the (random) measure $U$, defined by (3.3), in terms
of a given point process $N$, is a weak solution of System (3.4)-(3.6). The study of
the well-posedness of this system could be addressed following, for instance, the
ideas given in the cited reference. In this case $U$ should be the unique solution of System (3.4)-(3.6).

As a last comment about Proposition 3.1, we analyse the particular case of the
linear Hawkes process in the following remark.

Remark 3.1. In the linear Hawkes process, $\lambda(t,\mathcal{F}^N_{t-}) = \mu + \int_{-\infty}^{t-} h(t-z)\,N(dz)$.
Thanks to (3.8) one decomposes the intensity into a term given by the initial condition
plus a term given by the measure $U$, $\lambda(t,\mathcal{F}^N_{t-}) = \mu + F_0(t) + \int_0^{t-} h(t-z)\,U(dz,0)$,
where $F_0(t) = \int_{-\infty}^{0} h(t-z)\,N_-(dz)$ is $(\mathcal{F}^N_0)$-measurable and considered as an initial
condition. Hence, (3.4)-(3.6) becomes a closed system in the sense that $\lambda(t,\mathcal{F}^N_{t-})$ is
now an explicit function of the solution of the system. This is not true in general.
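
The decomposition of Remark 3.1 can be illustrated numerically. The sketch below is an illustration only: the function name `hawkes_intensity_split` is hypothetical, and it assumes that the points of $N_-$ are strictly negative (i.e. $T_0 < 0$), so that no point is counted both in $F_0(t)$ and in the integral against $U(dz,0)$.

```python
import math

def hawkes_intensity_split(t, mu, h, past_points, new_points):
    """Return (mu, F0(t), feedback) for a linear Hawkes intensity at time t > 0.
    past_points: points of N_- (assumed < 0); new_points: points of N in [0, +inf)."""
    f0 = sum(h(t - z) for z in past_points)                # F_0(t), fixed by the past before 0
    feedback = sum(h(t - z) for z in new_points if z < t)  # integral of h(t - z) against U(dz, 0)
    return mu, f0, feedback
```

Summing the three returned terms gives the intensity $\lambda(t,\mathcal{F}^N_{t-})$; only the last term depends on the solution $U$, which is why the system is closed in the linear case.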


3.3. The PDE satisfied in expectation 

In this section, we want to find the system satisfied by the expectation of the 
random measure U. First, we need to give a proper definition of such an object. The 
construction is based on the construction of U and is summarized in the following 
proposition. (The proofs of all the results of this subsection are in Appendix A.1.)

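Proposition 3.2. Let $\Pi$, $\mathcal{F}^N_0$ and an intensity $(\lambda(t,\mathcal{F}^N_{t-}))_{t>0}$ be given as in Section 2.3
and satisfying $(\mathcal{A}_{T_0})$ and $(\mathcal{A}^{L^1,exp}_{\lambda,loc})$. Let $N$ be the process resulting of Ogata's
thinning and let $U$ be the random measure defined by (3.3). Let $\mathbb{E}$ denote the expectation
with respect to $\Pi$ and $\mathcal{F}^N_0$.

Then for any test function $\varphi$ in $\mathcal{M}_{c,b}(\mathbb{R}_+^2)$, $\mathbb{E}\left[\int \varphi(t,s)\,U(t,ds)\right]$ and
$\mathbb{E}\left[\int \varphi(t,s)\,U(dt,s)\right]$ are finite and one can define $u(t,ds)$ and $u(dt,s)$ by

$$\forall\, t \ge 0, \quad \int \varphi(t,s)\,u(t,ds) = \mathbb{E}\left[\int \varphi(t,s)\,U(t,ds)\right],$$

$$\forall\, s \ge 0, \quad \int \varphi(t,s)\,u(dt,s) = \mathbb{E}\left[\int \varphi(t,s)\,U(dt,s)\right].$$

Moreover, $u(t,ds)$ and $u(dt,s)$ satisfy $(\mathcal{P}_{Fubini})$ and one can define $u(dt,ds) =
u(t,ds)\,dt = u(dt,s)\,ds$ on $\mathbb{R}_+^2$, such that for any test function $\varphi$ in $\mathcal{M}_{c,b}(\mathbb{R}_+^2)$,

$$\int \varphi(t,s)\,u(dt,ds) = \mathbb{E}\left[\int \varphi(t,s)\,U(dt,ds)\right],$$

quantity which is finite.

In particular, since $\int \varphi(t,s)\,u(t,ds) = \mathbb{E}\left[\int \varphi(t,s)\,U(t,ds)\right] = \mathbb{E}\left[\varphi(t,S_{t-})\right]$, $u(t,\cdot)$
is therefore the distribution of $S_{t-}$, the (predictable version of the) age at time $t$.
Now let us show that, as expected, $u$ satisfies a system similar to (PPS).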

Theorem 3.3. Let $\Pi$, $\mathcal{F}^N_0$ and an intensity $(\lambda(t,\mathcal{F}^N_{t-}))_{t>0}$ be given as in Section



























June 9, 2015 0:43 WSPC/INSTRUCTION FILE PDE'Hawkes'Mariell 


16 J. Chevallier, M. Cdceres, M. Doumic, P. Reynaud-Bouret 


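2.3 and satisfying $(\mathcal{A}_{T_0})$ and $(\mathcal{A}^{L^1,exp}_{\lambda,loc})$. If $N$ is the process resulting of Ogata's
thinning, $(S_{t-})_{t>0}$ its associated predictable age process, $U$ its associated random
measure, defined by (3.3), and $u$ its associated mean measure, defined in Proposition 3.2,
then there exists a bivariate measurable function $\rho_{\lambda,\mathbb{P}_0}$ satisfying

$$\forall\, T \ge 0, \quad \int_0^T \int_s \rho_{\lambda,\mathbb{P}_0}(t,s)\,u(dt,ds) < \infty, \qquad \rho_{\lambda,\mathbb{P}_0}(t,s) = \mathbb{E}\left[\lambda(t,\mathcal{F}^N_{t-}) \,\middle|\, S_{t-} = s\right], \quad u(dt,ds)\text{-a.e.} \tag{3.9}$$

and such that $u$ is solution in the weak sense of the following system

$$\frac{\partial}{\partial t} u(dt,ds) + \frac{\partial}{\partial s} u(dt,ds) + \rho_{\lambda,\mathbb{P}_0}(t,s)\,u(dt,ds) = 0, \tag{3.10}$$

$$u(dt,0) = \int_{s\in\mathbb{R}_+} \rho_{\lambda,\mathbb{P}_0}(t,s)\,u(t,ds)\,dt + \delta_0(dt)\,u^{in}(\{0\}), \tag{3.11}$$

$$u(0,ds) = u^{in}(ds)\,\mathbf{1}_{s>0}, \tag{3.12}$$

where $u^{in}$ is the law of $-T_0$. The weak sense means here that for any $\varphi \in C^\infty_{c,b}(\mathbb{R}_+^2)$,

$$\int_{\mathbb{R}_+\times\mathbb{R}_+} \left(\frac{\partial}{\partial t} + \frac{\partial}{\partial s}\right)\varphi(t,s)\,u(dt,ds) + \int_{\mathbb{R}_+\times\mathbb{R}_+} [\varphi(t,0) - \varphi(t,s)]\,\rho_{\lambda,\mathbb{P}_0}(t,s)\,u(dt,ds) + \int_{\mathbb{R}_+} \varphi(0,s)\,u^{in}(ds) = 0. \tag{3.13}$$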

Comparing this system to (PPS), one first sees that $n(\cdot,t)$, the density of the
age at time $t$, is replaced by the mean measure $u(t,\cdot)$. If $u^{in} \in L^1(\mathbb{R}_+)$ we have
$u^{in}(\{0\}) = 0$, so we get an equation which is exactly of renewal type, as (PPS).
In the general case where $u^{in}$ is only a probability measure, the difference with
(PPS) lies in the term $\delta_0(dt)\,u^{in}(\{0\})$ in the boundary condition for $s = 0$ and in
the term $\mathbf{1}_{s>0}$ in the initial condition for $t = 0$. Both these extra terms are linked
to the possibility for the initial measure $u^{in}$ to charge zero. This possibility is not
considered in the cited work; otherwise, a similar extra term would be needed in that setting as
well. As said above in the comment of Proposition 3.1, we want to keep this term
here since it models the case where there is a specific stimulus at time zero.

In general and without more assumptions on $\lambda$, it is not clear that $u$ is not only
a measure satisfying $(\mathcal{P}_{Fubini})$ but also absolutely continuous with respect to $dt\,ds$ and that
the equations can be satisfied in a strong sense.

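Concerning $p(s,X(t))$, which has always been thought of as the equivalent of
$\lambda(t,\mathcal{F}^N_{t-})$, it is not replaced by $\lambda(t,\mathcal{F}^N_{t-})$, which would have no meaning in general
since this is a random quantity, nor by $\mathbb{E}\left[\lambda(t,\mathcal{F}^N_{t-})\right]$, which would have been a first
possible guess; it is replaced by $\mathbb{E}\left[\lambda(t,\mathcal{F}^N_{t-}) \,\middle|\, S_{t-} = s\right]$. Indeed intuitively, since

$$\mathbb{E}\left[\int_{x=0}^{\lambda(t,\mathcal{F}^N_{t-})} \Pi(dt,dx) \,\middle|\, \mathcal{F}^N_{t-}\right] = \lambda(t,\mathcal{F}^N_{t-})\,dt,$$

the corresponding weak term can be interpreted as, for any test function $\varphi$,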
















Microscopic approach of a time elapsed neural model 17 


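$$\mathbb{E}\left[\int \varphi(t,s) \left(\int_{x=0}^{\lambda(t,\mathcal{F}^N_{t-})} \Pi(dt,dx)\right) U(t,ds)\right] = \mathbb{E}\left[\int \varphi(t,s)\,\lambda(t,\mathcal{F}^N_{t-})\,\delta_{S_{t-}}(ds)\,dt\right] = \int_t \mathbb{E}\left[\varphi(t,S_{t-})\,\mathbb{E}\left[\lambda(t,\mathcal{F}^N_{t-}) \,\middle|\, S_{t-}\right]\right] dt,$$

which is exactly $\int \varphi(t,s)\,\rho_{\lambda,\mathbb{P}_0}(t,s)\,u(dt,ds)$.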

This conditional expectation makes dependencies particularly complex, but it
also enables us to derive equations even in non-Markovian settings (as Hawkes processes
for instance, see Section 4.3). More explicitly, $\rho_{\lambda,\mathbb{P}_0}(t,s)$ is a function of the time $t$ and
of the age $s$, but it also depends on $\lambda$, the shape of the intensity of the underlying
process, and on the distribution of the initial condition $N_-$, that is $\mathbb{P}_0$. As explained
earlier, it is both the knowledge of $\mathbb{P}_0$ and $\lambda$ that characterizes the distribution
of the process, and in general the conditional expectation cannot be reduced to
something depending on less than that. In Section 4, we discuss several examples
of point processes where one can (or cannot) reduce the dependence.

Note that here again, we can prove that the equation is conservative by taking
(a sequence of functions converging to) $\varphi = \mathbf{1}_{t\le T}$ as a test function.

A direct corollary of Theorem 3.3 can be deduced thanks to the law of large
numbers. This can be seen as the interpretation of the (PPS) equation at a macroscopic
level, when the population of neurons is i.i.d.


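Corollary 3.4. Let $(N^i)_{i\ge 1}$ be some i.i.d. point processes with intensity given by
$\lambda(t,\mathcal{F}^{N^i}_{t-})$ on $(0,+\infty)$ satisfying $(\mathcal{A}^{L^1,exp}_{\lambda,loc})$ and associated predictable age processes
$(S^i_{t-})_{t>0}$. Suppose furthermore that the distribution of $N^1$ on $(-\infty,0]$ is given by
$\mathbb{P}_0$, which is such that $\mathbb{P}_0(N^1_- = \emptyset) = 0$.

Then there exists a measure $u$ satisfying $(\mathcal{P}_{Fubini})$, weak solution of Equations
(3.10) and (3.11), with $\rho_{\lambda,\mathbb{P}_0}$ defined by

$$\rho_{\lambda,\mathbb{P}_0}(t,s) = \mathbb{E}\left[\lambda(t,\mathcal{F}^{N^1}_{t-}) \,\middle|\, S^1_{t-} = s\right], \quad u(dt,ds)\text{-a.e.}$$

and with $u^{in}$ distribution of the age at time 0, such that for any $\varphi \in C^\infty_{c,b}(\mathbb{R}_+^2)$,

$$\forall\, t > 0, \quad \int \varphi(t,s) \left(\frac{1}{n}\sum_{i=1}^{n} \delta_{S^i_t}(ds)\right) \xrightarrow[n\to\infty]{a.s.} \int \varphi(t,s)\,u(t,ds). \tag{3.14}$$

In particular, informally, the fraction of neurons at time $t$ with age in $[s,s+ds)$
in this i.i.d. population of neurons indeed tends to $u(t,ds)$.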
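
The population interpretation can be illustrated by a small Monte Carlo experiment in the simplest case. The sketch below is only an illustration under assumptions that are not in the text: each neuron is a homogeneous Poisson process of rate `rate` on $(0,+\infty)$, and $-T_0$ is drawn from an exponential law with the same rate, so that the age at any time $t$ is again exponentially distributed and the fraction of neurons with age larger than $s$ should be close to $e^{-\text{rate}\cdot s}$. The function names are hypothetical, and the indicator of $(s,+\infty)$ is used informally in place of a smooth test function.

```python
import math
import random

def age_at_time(t, rate, rng):
    """Age at time t of one neuron: Poisson(rate) spikes on (0, t], last spike before 0 at -A0."""
    last = -rng.expovariate(rate)   # T_0 = -A0 with A0 ~ Exp(rate): an illustrative choice
    nxt = rng.expovariate(rate)     # first spike after time 0
    while nxt <= t:
        last = nxt
        nxt += rng.expovariate(rate)
    return t - last

def empirical_survival(s, t, rate, n, seed=0):
    """Fraction of n i.i.d. neurons whose age at time t exceeds s."""
    rng = random.Random(seed)
    return sum(age_at_time(t, rate, rng) > s for _ in range(n)) / n
```

With `n` of the order of a few thousand neurons, `empirical_survival(0.4, 1.5, 2.0, n)` is expected to be close to `math.exp(-0.8)`, up to the usual Monte Carlo fluctuations of order $n^{-1/2}$.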


4. Application to the various examples

Let us now apply these results to the examples presented in Section 2.2.

4.1. When the intensity only depends on time and age

If $\lambda(t,\mathcal{F}^N_{t-}) = f(t,S_{t-})$ (homogeneous and inhomogeneous Poisson processes and
renewal processes are particular examples) then the intuition giving that $p(s,X(t))$
is analogous to $\lambda(t,\mathcal{F}^N_{t-})$ works. Let us assume that $f(t,s) \in L^\infty(\mathbb{R}_+^2)$. We have
$\mathbb{E}\left[\lambda(t,\mathcal{F}^N_{t-}) \,\middle|\, S_{t-} = s\right] = f(t,s)$. Under this assumption, we may apply Theorem 3.3,
so that we know that the mean measure $u$ associated to the random process is a solution
of System (3.10)-(3.12). Therefore the mean measure $u$ satisfies a completely
explicit PDE of the type (PPS) with $\rho_{\lambda,\mathbb{P}_0}(t,s) = f(t,s)$ replacing $p(s,X(t))$. In
particular, in this case $\rho_{\lambda,\mathbb{P}_0}(t,s)$ does not depend on the initial condition. As already
underlined, in general, the distribution of the process is characterized by
$\lambda(t,\mathcal{F}^N_{t-}) = f(t,S_{t-})$ and by the distribution of $N_-$. Therefore, in this special
case, this dependence is actually reduced to the function $f$ and the distribution of
$-T_0$. Since $f(\cdot,\cdot) \in L^\infty([0,T]\times\mathbb{R}_+)$, assuming also $u^{in} \in L^1(\mathbb{R}_+)$, it is well-known
that there exists a unique solution $u$ such that $(t \mapsto u(t,\cdot)) \in C([0,T],L^1(\mathbb{R}_+))$,
see for instance Section 3.3, p. 60 of the cited reference. Note that uniqueness for measure
solutions may also be established, hence the mean measure $u$ associated to
the random process is the unique solution of System (3.10)-(3.12), and it is in
$C([0,T],L^1(\mathbb{R}_+))$: the PDE formulation, together with existence and uniqueness,
has provided a regularity result on $u$ which is obtained under weaker assumptions
than through Fokker-Planck / Kolmogorov equations. This is another possible application
field of our results: using the PDE formulation to gain regularity. Let us
now develop the Fokker-Planck / Kolmogorov approach for renewal processes.
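
When the rate only depends on the age, System (3.10)-(3.12) can be explored numerically. The sketch below is not taken from the text: it is a basic explicit scheme along the characteristics (time step equal to age step), written under the assumptions that $u^{in}$ has a density, that the rate $f$ depends on $s$ only, and that $h \cdot \max f \le 1$; the names `step_age_density` and `total_mass` are hypothetical. The boundary value is the discrete counterpart of $\int f(s)\,u(t,ds)$, which makes the total mass conserved by construction.

```python
def step_age_density(n, f, h):
    """One step of size h (time step = age step) for d_t n + d_s n + f(s) n = 0.
    n[j] approximates the density on the age cell [j*h, (j+1)*h); needs h*max(f) <= 1."""
    fired = [h * f(j * h) * n[j] for j in range(len(n))]  # part of each cell that fires
    survivors = [n[j] - fired[j] for j in range(len(n))]  # the rest ages by one cell
    return [sum(fired)] + survivors                       # boundary value n(t, 0) = int f n ds

def total_mass(n, h):
    """Discrete version of the integral of the density over all ages."""
    return h * sum(n)
```

Starting from any discretized probability density, iterating `step_age_density` keeps `total_mass` equal to 1 up to rounding errors, which mirrors the conservativeness noted after Theorem 3.3.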

Renewal processes. The renewal process, i.e. when $\lambda(t,\mathcal{F}^N_{t-}) = f(S_{t-})$, with $f$ a
continuous function on $\mathbb{R}_+$, has particular properties. As noted in Section 2.2, the
renewal age process $(S_{t-})_{t>0}$ is a homogeneous Markovian process. It has been known
for a long time that it is easy to derive a PDE on the corresponding density through
Fokker-Planck / Kolmogorov equations, once the variable of interest (here the age)
is Markovian. Here we briefly follow this line to see what kind of
PDE can be derived through the Markovian properties and to compare the equation
with the (PPS) type system derived in Theorem 3.3.

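Since $f$ is continuous, the infinitesimal generator$^a$ of $(S_t)_{t>0}$ is given by

$$(\mathcal{G}\phi)(x) = \phi'(x) + f(x)\,(\phi(0) - \phi(x)), \tag{4.1}$$

for all $\phi \in C^1(\mathbb{R}_+)$. Note that, since for every $t > 0$, $S_{t-} = S_t$ a.s., the process
$(S_{t-})_{t>0}$ is also Markovian with the same infinitesimal generator.

$^a$ The infinitesimal generator of a homogeneous Markov process $(Z_t)_{t\ge 0}$ is the operator $\mathcal{G}$ which
is defined to act on every function $\phi : \mathbb{R}^n \to \mathbb{R}$ in a suitable space $\mathcal{D}$ by

$$\mathcal{G}\phi(x) = \lim_{t\to 0^+} \frac{\mathbb{E}\left[\phi(Z_t) \,\middle|\, Z_0 = x\right] - \phi(x)}{t}.$$

Let us now define, for all $t \ge 0$ and all $\phi \in C^1(\mathbb{R}_+)$,

$$P_t\phi(x) = \mathbb{E}\left[\phi(S_{t-}) \,\middle|\, S_0 = x\right] = \int \phi(s)\,u_x(t,ds),$$

where $x \in \mathbb{R}_+$ and $u_x(t,\cdot)$ is the distribution of $S_{t-}$ given that $S_0 = x$. Note
that $u_x(t,ds)$ corresponds to the marginal in the sense of $(\mathcal{P}_{Fubini})$ of $u_x$ given by
Theorem 3.3 with $\rho_{\lambda,\mathbb{P}_0}(t,s) = f(s)$ and initial condition $\delta_x$, i.e. $T_0 = -x$ a.s.

In this homogeneous Markovian case, the forward Kolmogorov equation gives

$$\frac{\partial}{\partial t} P_t = P_t\,\mathcal{G}.$$

Let $\varphi \in C^\infty_{c,b}(\mathbb{R}_+^2)$ and let $t > 0$. This implies that

$$\frac{\partial}{\partial t}\left(P_t\varphi(t,s)\right) = P_t\,\mathcal{G}\varphi(t,s) + P_t\frac{\partial}{\partial t}\varphi(t,s) = P_t\left[\frac{\partial}{\partial s}\varphi(t,s) + f(s)\,(\varphi(t,0) - \varphi(t,s)) + \frac{\partial}{\partial t}\varphi(t,s)\right].$$

Since $\varphi$ is compactly supported in time, an integration with respect to $t$ yields

$$-P_0\varphi(0,\cdot) = \int P_t\left(\frac{\partial}{\partial t} + \frac{\partial}{\partial s}\right)\varphi\,dt + \int P_t\,f(s)\,(\varphi(t,0) - \varphi(t,s))\,dt,$$

or equivalently

$$-\varphi(0,x) = \int \left(\frac{\partial}{\partial t} + \frac{\partial}{\partial s}\right)\varphi(t,s)\,u_x(t,ds)\,dt - \int (\varphi(t,s) - \varphi(t,0))\,f(s)\,u_x(t,ds)\,dt, \tag{4.2}$$

in terms of $u_x$. This is exactly Equation (3.13) with $u^{in} = \delta_x$.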


The result of Theorem 3.3 is stronger than the application of the forward Kolmogorov
equation on homogeneous Markovian systems, since the result of Theorem
3.3 never used the Markov assumption and can be applied to non Markovian processes
(see Section 4.3). So the present work is a general set-up where one can deduce
PDE even from non Markovian microscopic random dynamics. Note also that only
boundedness assumptions and not continuity ones are necessary to directly obtain
(4.2) via Theorem 3.3: to obtain the classical Kolmogorov theorem, one would have
assumed $f \in C^0(\mathbb{R}_+^2)$ rather than $f \in L^\infty(\mathbb{R}_+^2)$.


4.2. Generalized Wold process

In the case where $\lambda(t,\mathcal{F}^N_{t-}) = f(S_{t-},A^1_t,\dots,A^k_t)$, with $f$ being a non-negative function,
one can define in a similar way $u_k(t,s,a_1,\dots,a_k)$, which is informally the
distribution at time $t$ of the processes with age $s$ and past given by $a_1,\dots,a_k$ for
the last $k$ ISIs. We want to investigate this case not for its Markovian properties,
which are nevertheless presented in Proposition B.2 in the appendix for the sake of
completeness, but because this is the first basic example where the initial condition
is indeed impacting $\rho_{\lambda,\mathbb{P}_0}$ in Theorem 3.3.

To do so, the whole machinery applied on $u(dt,ds)$ is first extended in the next result
to $u_k(dt,ds,da_1,\dots,da_k)$, which represents the dynamics of the age and the last
$k$ ISIs. This could have been done in a very general way by an easy generalisation
of Theorem 3.3. However, to avoid too cumbersome equations, we express it only
for generalized Wold processes to provide a clean setting to illustrate the impact
of the initial conditions on $\rho_{\lambda,\mathbb{P}_0}$. Hence, we similarly define a random distribution
$U_k(dt,ds,da_1,\dots,da_k)$ such that its evaluation at any given time $t$ exists and is

$$U_k(t,ds,da_1,\dots,da_k) = \delta_{(S_{t-},A^1_t,\dots,A^k_t)}(ds,da_1,\dots,da_k). \tag{4.3}$$

The following result states the PDE satisfied by $u_k = \mathbb{E}[U_k]$.


Proposition 4.1. Let $k$ be a positive integer and $f$ be some non negative function
on $\mathbb{R}_+^{k+1}$. Let $N$ be a generalized Wold process with predictable age process
$(S_{t-})_{t>0}$, associated points $(T_i)_{i\in\mathbb{Z}}$ and intensity $\lambda(t,\mathcal{F}^N_{t-}) = f(S_{t-},A^1_t,\dots,A^k_t)$ satisfying
$(\mathcal{A}^{L^1,exp}_{\lambda,loc})$, where $A^1_t,\dots,A^k_t$ are the successive ages defined by (2.2). Suppose
that $\mathbb{P}_0$ is such that $\mathbb{P}_0(T_{-k} > -\infty) = 1$. Let $U_k$ be defined by

$$U_k(dt,ds,da_1,\dots,da_k) = \sum_{i=0}^{+\infty} \eta_{T_i}(dt,ds) \prod_{j=1}^{k} \delta_{A^j_{T_{i+1}}}(da_j)\, \mathbf{1}_{0\le t\le T_{i+1}}. \tag{4.4}$$

If $N$ is the result of Ogata's thinning on the Poisson process $\Pi$, then $U_k$ satisfies
(4.3) and $(\mathcal{P}_{Fubini})$ a.s. in $\Pi$ and $\mathcal{F}^N_0$. Assume that the initial condition $u^{in}_k$, defined
as the distribution of $(-T_0,A^1_0,\dots,A^k_0)$, which is a random vector in $\mathbb{R}^{k+1}$, is such
that $u^{in}_k(\{0\}\times\mathbb{R}_+^k) = 0$. Then $U_k$ admits a mean measure $u_k$ which also satisfies
$(\mathcal{P}_{Fubini})$ and the following system in the weak sense: on $\mathbb{R}_+\times\mathbb{R}_+^{k+1}$,

$$\left\{\frac{\partial}{\partial t} + \frac{\partial}{\partial s}\right\} u_k(dt,ds,da_1,\dots,da_k) + f(s,a_1,\dots,a_k)\,u_k(dt,ds,da_1,\dots,da_k) = 0, \tag{4.5}$$

$$u_k(dt,0,ds,da_1,\dots,da_{k-1}) = \int_{a_k=0}^{\infty} f(s,a_1,\dots,a_k)\,u_k(t,ds,da_1,\dots,da_k)\,dt, \tag{4.6}$$

$$u_k(0,ds,da_1,\dots,da_k) = u^{in}_k(ds,da_1,\dots,da_k). \tag{4.7}$$


We have assumed $u^{in}_k(\{0\}\times\mathbb{R}_+^k) = 0$ (i.e. $T_0 \ne 0$ a.s.) for the sake of simplicity,
but this assumption may of course be relaxed and Dirac masses at 0 should then
be added in a similar way as in Theorem 3.3.

If $f \in L^\infty(\mathbb{R}_+^{k+1})$, we may apply Proposition 4.1, so that the mean measure
$u_k$ satisfies System (4.5)-(4.7). Assuming an initial condition $u^{in}_k \in L^1(\mathbb{R}_+^{k+1})$, we
can prove exactly as for the renewal equation (with a Banach fixed point argument
for instance) that, in the generalized Wold case, there exists a unique solution $u_k$ such that
$(t \mapsto u_k(t,\cdot)) \in C(\mathbb{R}_+,L^1(\mathbb{R}_+^{k+1}))$, the boundary assumption on the
$k$th penultimate point before time 0 being necessary to give sense to the successive
ages at time 0. By uniqueness, this proves that the mean measure $u_k$ is this solution,
so that it belongs to $C(\mathbb{R}_+,L^1(\mathbb{R}_+^{k+1}))$: Proposition 4.1 leads to a regularity result
on the mean measure.

Now that we have clarified the dynamics of the successive ages, one can look
at this system from the point of view of Theorem 3.3, that is when only two variables
$s$ and $t$ are considered. In this respect, let us note that $U$ defined by (3.3) is
such that $U(dt,ds) = \int_{a_1,\dots,a_k} U_k(dt,ds,da_1,\dots,da_k)$. Since the integrals and the
expectations are exchangeable in the weak sense, the mean measure $u$ defined in
Proposition 3.2 is such that $u(dt,ds) = \int_{a_1,\dots,a_k} u_k(dt,ds,da_1,\dots,da_k)$. But (4.5)
in the weak sense means, for all $\varphi \in C^\infty_{c,b}(\mathbb{R}_+^{k+2})$,

$$\int \left(\frac{\partial}{\partial t} + \frac{\partial}{\partial s}\right)\varphi(t,s,a_1,\dots,a_k)\,u_k(dt,ds,da_1,\dots,da_k)$$
$$+ \int [\varphi(t,0,s,a_1,\dots,a_{k-1}) - \varphi(t,s,a_1,\dots,a_k)]\,f(s,a_1,\dots,a_k)\,u_k(dt,ds,da_1,\dots,da_k)$$
$$+ \int \varphi(0,s,a_1,\dots,a_k)\,u^{in}_k(ds,da_1,\dots,da_k) = 0. \tag{4.8}$$

Letting $\psi \in C^\infty_{c,b}(\mathbb{R}_+^2)$ and $\varphi \in C^\infty_{c,b}(\mathbb{R}_+^{k+2})$ being such that

$$\forall\, t,s,a_1,\dots,a_k, \quad \varphi(t,s,a_1,\dots,a_k) = \psi(t,s),$$

we end up proving that the function $\rho_{\lambda,\mathbb{P}_0}$ defined in Theorem 3.3 satisfies

$$\rho_{\lambda,\mathbb{P}_0}(t,s)\,u(dt,ds) = \int_{a_1,\dots,a_k} f(s,a_1,\dots,a_k)\,u_k(dt,ds,da_1,\dots,da_k), \tag{4.9}$$

$u(dt,ds)$-almost everywhere (a.e.). Equation (4.9) means exactly, from a probabilistic
point of view, that

$$\rho_{\lambda,\mathbb{P}_0}(t,s) = \mathbb{E}\left[f(S_{t-},A^1_t,\dots,A^k_t) \,\middle|\, S_{t-} = s\right], \quad u(dt,ds)\text{-a.e.}$$

Therefore, in the particular case of generalized Wold processes, the quantity $\rho_{\lambda,\mathbb{P}_0}$
depends on the shape of the intensity (here the function $f$) and also on $u_k$. But,
by Proposition 4.1, $u_k$ depends on its initial condition given by the distribution of
$(-T_0,A^1_0,\dots,A^k_0)$, and not only $-T_0$ as in the initial condition for $u$. That is, as
announced in the remarks following Theorem 3.3, $\rho_{\lambda,\mathbb{P}_0}$ depends in particular on the
whole distribution of the underlying process before time 0, namely $\mathbb{P}_0$, and not only
on the initial condition for $u$. Here, for generalized Wold processes, it only depends
on the last $k$ points before time 0. For more general non Markovian settings, the
integration cannot be simply described by a measure $u_k$ in dimension $(k+2)$ being
integrated with respect to $da_1\dots da_k$. In general, the integration has to be done on
all the "randomness" hidden behind the dependence of $\lambda(t,\mathcal{F}^N_{t-})$ with respect to
the past once $S_{t-}$ is fixed, and in this sense it depends on the whole distribution
$\mathbb{P}_0$ of $N_-$. This is made even clearer on the following non Markovian example: the
Hawkes process.

4.3. Hawkes process

As we have seen in Section 2.2, there are many different examples of Hawkes processes
that can all be expressed as $\lambda(t,\mathcal{F}^N_{t-}) = \phi\left(\int_{-\infty}^{t-} h(t-x)\,N(dx)\right)$, where the
main case is $\phi(\theta) = \mu + \theta$, for $\mu$ some positive constant, which is the linear case.

When there is no point before 0, $\lambda(t,\mathcal{F}^N_{t-}) = \phi\left(\int_0^{t-} h(t-x)\,N(dx)\right)$. In this
case, the interpretation is so close to (PPS) that the first guess, which is wrong,
would be that the analogue in (PPS) is

$$p(s,X(t)) = \phi(X(t)), \tag{4.10}$$

where $X(t) = \mathbb{E}\left[\int_0^{t-} h(t-x)\,N(dx)\right] = \int_0^t h(t-x)\,u(dx,0)$. This is wrong, even
in the linear case, since $\lambda(t,\mathcal{F}^N_{t-})$ depends on all the previous points. Therefore $\rho_{\lambda,\mathbb{P}_0}$
defined by (3.9) corresponds to a conditioning given only the last point.

By looking at this problem through the generalized Wold approach, one can
hope that for $h$ decreasing fast enough:

$$\lambda(t,\mathcal{F}^N_{t-}) \simeq \phi\left(h(S_{t-}) + h(S_{t-} + A^1_t) + \dots + h(S_{t-} + A^1_t + \dots + A^k_t)\right).$$

In this sense and with respect to generalized Wold processes described in the
previous section, we are informally integrating on "all the previous points" except
the last one and not integrating over all the previous points. This is informally
why (4.10) is wrong even in the linear case. Actually, $\rho_{\lambda,\mathbb{P}_0}$ is computable for linear
Hawkes processes: we show in the next section that $\rho_{\lambda,\mathbb{P}_0}(t,s) \ne \phi\left(\int_{-\infty}^{t} h(t-x)\,u(dx,0)\right)
= \mu + \int_{-\infty}^{t} h(t-x)\,u(dx,0)$ and that $\rho_{\lambda,\mathbb{P}_0}$ explicitly depends on $\mathbb{P}_0$.


4.3.1. Linear Hawkes process

We are interested in Hawkes processes with a past before time 0 given by $\mathcal{F}^N_0$, which
is not necessarily the past given by a stationary Hawkes process. To illustrate the
fact that the past is impacting the value of $\rho_{\lambda,\mathbb{P}_0}$, we focus on two particular cases:

$(\mathcal{A}^1_{N_-})$ $N_- = \{T_0\}$ a.s. and $T_0$ admits a bounded density $f_0$ on $\mathbb{R}_-$;

$(\mathcal{A}^2_{N_-})$ $N_-$ is a homogeneous Poisson process with intensity $\alpha$ on $\mathbb{R}_-$.

Before stating the main result, we need some technical definitions. Indeed the
proof is based on the underlying branching structure of the linear Hawkes process
described in Section B.3.1 of the appendix, and the following functions $(L_s,G_s)$ are
naturally linked to this branching decomposition (see Lemma B.7).


Lemma 4.2. Let $h \in L^1(\mathbb{R}_+)$ such that $\|h\|_{L^1} < 1$. For all $s \ge 0$, there exists a
unique solution $(L_s,G_s) \in L^1(\mathbb{R}_+)\times L^\infty(\mathbb{R}_+)$ of the following system

$$\log(G_s(x)) = \int_0^{(x-s)\vee 0} G_s(x-w)\,h(w)\,dw - \int_0^x h(w)\,dw, \tag{4.11}$$

$$L_s(x) = \int_{s\wedge x}^{x} (h(w) + L_s(w))\,G_s(w)\,h(x-w)\,dw, \tag{4.12}$$

where $a \vee b$ (resp. $a \wedge b$) denotes the maximum (resp. minimum) between $a$ and $b$.
Moreover, $L_s(x \le s) \equiv 0$, $G_s : \mathbb{R}_+ \to [0,1]$, and $L_s$ is uniformly bounded in $L^1$.

This result allows us to define two other important quantities, $K_s$ and $q$, by, for all
$s,t \ge 0$, $z \in \mathbb{R}$,

$$K_s(t,z) := \int_0^{(t-s)\vee 0} [h(t-x) + L_s(t-x)]\,G_s(t-x)\,h(x-z)\,dx,$$

$$\log(q(t,s,z)) := -\int_{(t-s)\vee 0}^{t} h(x-z)\,dx - \int_0^{(t-s)\vee 0} [1 - G_s(t-x)]\,h(x-z)\,dx. \tag{4.13}$$

Finally, the following result is just an obvious remark that helps to understand the
resulting system.

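Remark 4.1. For a non negative $\Phi \in L^\infty(\mathbb{R}_+^2)$ and $v^{in} \in L^\infty(\mathbb{R}_+)$, there exists a
unique solution $v \in L^\infty(\mathbb{R}_+^2)$ in the weak sense to the following system,

$$\frac{\partial}{\partial t} v(t,s) + \frac{\partial}{\partial s} v(t,s) + \Phi(t,s)\,v(t,s) = 0, \tag{4.14}$$

$$v(t,0) = 1, \qquad v(t=0,s) = v^{in}(s). \tag{4.15}$$

Moreover $t \mapsto v(t,\cdot)$ is in $C(\mathbb{R}_+,L^1_{loc}(\mathbb{R}_+))$.

If $v^{in}$ is a survival function (i.e. non increasing from 1 to 0), then $v(t,\cdot)$ is a
survival function and $-\partial_s v$ is a probability measure for all $t > 0$.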
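
To see why the system of Remark 4.1 is easy to handle, one may integrate it formally along the characteristics $s - t = \text{constant}$. The computation below is only a formal sketch, assuming enough regularity to differentiate along these lines; it is not taken from the text. Along a characteristic, $\frac{d}{d\tau} v(\tau, s_0 + \tau) = -\Phi(\tau, s_0+\tau)\,v(\tau,s_0+\tau)$, which gives

```latex
\begin{align*}
v(t,s) &= \exp\Big(-\int_0^{s} \Phi(t-s+\sigma,\sigma)\,d\sigma\Big), && 0 \le s < t,\\
v(t,s) &= v^{in}(s-t)\,\exp\Big(-\int_0^{t} \Phi(\tau, s-t+\tau)\,d\tau\Big), && s \ge t.
\end{align*}
```

The first line starts from the boundary value $v(t-s,0) = 1$ and the second from the initial value $v(0,s-t) = v^{in}(s-t)$; since $\Phi \ge 0$, both expressions stay between 0 and 1 when $v^{in}$ does.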


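Proposition 4.3. Using the notations of Theorem 3.3, let $N$ be a Hawkes process
with past before 0 given by $N_-$ satisfying either $(\mathcal{A}^1_{N_-})$ or $(\mathcal{A}^2_{N_-})$ and with intensity
on $\mathbb{R}_+$ given by

$$\lambda(t,\mathcal{F}^N_{t-}) = \mu + \int_{-\infty}^{t-} h(t-x)\,N(dx),$$

where $\mu$ is a positive real number and $h \in L^\infty(\mathbb{R}_+)$ is a non-negative function with
support in $\mathbb{R}_+$ such that $\int h < 1$.

Then, the mean measure $u$ defined in Proposition 3.2 satisfies Theorem 3.3 and
moreover its integral $v(t,s) := \int_s^{\infty} u(t,d\sigma)$ is the unique solution of the system (4.14)-(4.15),
where $v^{in}$ is the survival function of $-T_0$, and where $\Phi = \Phi^{\mu,h}_{\mathbb{P}_0} \in L^\infty(\mathbb{R}_+^2)$ is
defined by

$$\Phi^{\mu,h}_{\mathbb{P}_0} = \Phi^{\mu,h}_{+} + \Phi^{h}_{-,\mathbb{P}_0}, \tag{4.16}$$

where for all non negative $s,t$

$$\Phi^{\mu,h}_{+}(t,s) = \mu\left(1 + \int_{s\wedge t}^{t} (h(x) + L_s(x))\,G_s(x)\,dx\right), \tag{4.17}$$

and where under Assumption $(\mathcal{A}^1_{N_-})$,

$$\Phi^{h}_{-,\mathbb{P}_0}(t,s) = \frac{\int_{-\infty}^{0\wedge(t-s)} (h(t-t_0) + K_s(t,t_0))\,q(t,s,t_0)\,f_0(t_0)\,dt_0}{\int_{-\infty}^{0\wedge(t-s)} q(t,s,t_0)\,f_0(t_0)\,dt_0}, \tag{4.18}$$

or, under Assumption $(\mathcal{A}^2_{N_-})$,

$$\Phi^{h}_{-,\mathbb{P}_0}(t,s) = \alpha \int_{-\infty}^{0\wedge(t-s)} (h(t-z) + K_s(t,z))\,q(t,s,z)\,dz. \tag{4.19}$$

In these formulae, $L_s$, $G_s$, $K_s$ and $q$ are given by Lemma 4.2 and (4.13). Moreover

$$\forall\, s \ge 0, \quad \int_s^{+\infty} \rho_{\lambda,\mathbb{P}_0}(t,x)\,u(t,dx) = \Phi^{\mu,h}_{\mathbb{P}_0}(t,s) \int_s^{+\infty} u(t,dx). \tag{4.20}$$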


The proof is included in Appendix B.3. Proposition 4.3 gives a purely analytical
definition for $v$, and thus for $u$, in two specific cases, namely $(\mathcal{A}^1_{N_-})$ or $(\mathcal{A}^2_{N_-})$.
In the general case, treated in Appendix B (Proposition B.5), there remains a
dependence with respect to the initial condition $\mathbb{P}_0$, via the function $\Phi^{h}_{-,\mathbb{P}_0}$.


Remark 4.2. Contrary to the general result in Theorem 3.3, Proposition 4.3
focuses on the equation satisfied by $v(dt,s) = \int_s^{+\infty} u(dt,dx)$ because in Equation
(4.14) the function parameter $\Phi = \Phi^{\mu,h}_{\mathbb{P}_0}$ may be defined independently of the
definitions of $v$ or $u$, which is not the case for the rate $\rho_{\lambda,\mathbb{P}_0}$ appearing in Equation
(3.10). Thus, it is possible to start from the system of equations defining $v$,
study it, prove existence, uniqueness and regularity for $v$ under some assumptions
on the initial distribution $u^{in}$ as well as on the birth function $h$, and then deduce
regularity or asymptotic properties for $u$ without any previous knowledge on the
underlying process.

In Sections 4.1 and 4.2, we were able to use the PDE formulation to prove that the
distribution of the ages $u$ has a density. Here, since we only obtain a closed formula
for $v$ and not for $u$, we would need to differentiate Equation (4.14) with respect to $s$ to obtain a similar
result, so that we need to prove more regularity on $\Phi^{\mu,h}_{\mathbb{P}_0}$. Such regularity for $\Phi^{\mu,h}_{\mathbb{P}_0}$
is not obvious since it depends strongly on the assumptions on $N_-$. This paves the
way for future research, where the PDE formulation would provide regularity on
the distribution of the ages, as done above for renewal and Wold processes.

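Remark 4.3. These two cases $(\mathcal{A}^1_{N_-})$ and $(\mathcal{A}^2_{N_-})$ highlight the dependence with
respect to all the past before time 0 (i.e. $\mathbb{P}_0$) and not only the initial condition (i.e.
the age at time 0). In fact, they can give the same initial condition $u^{in}$: for instance,
$(\mathcal{A}^1_{N_-})$ with $-T_0$ exponentially distributed with parameter $\alpha > 0$ gives the same
law for $-T_0$ as $(\mathcal{A}^2_{N_-})$ with parameter $\alpha$. However, if we fix some non-negative real
number $s$, one can show that $\Phi^{h}_{-,\mathbb{P}_0}(0,s)$ is different in those two cases. It is clear
from the definitions that for every real number $z$, $q(0,s,z) = 1$ and $K_s(0,z) = 0$.
Thus, in the first case,

$$\Phi^{h}_{-,\mathbb{P}_0}(0,s) = \frac{\int_{-\infty}^{-s} h(-t_0)\,\alpha e^{\alpha t_0}\,dt_0}{\int_{-\infty}^{-s} \alpha e^{\alpha t_0}\,dt_0} = \frac{\int_s^{\infty} h(z)\,\alpha e^{-\alpha z}\,dz}{\int_s^{\infty} \alpha e^{-\alpha z}\,dz},$$

while in the second case, $\Phi^{h}_{-,\mathbb{P}_0}(0,s) = \alpha\int_{-\infty}^{-s} h(-z)\,dz = \alpha\int_s^{\infty} h(w)\,dw$. Therefore
$\Phi^{h}_{-,\mathbb{P}_0}$ clearly depends on $\mathbb{P}_0$ and not just on the distribution of the last point before
0, and so does $\rho_{\lambda,\mathbb{P}_0}$.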
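
The two expressions of Remark 4.3 are easy to compare numerically. The sketch below is an illustration under assumptions that are not in the text: it uses a composite midpoint rule, truncates the integrals at a finite `cutoff`, and the function names are hypothetical. For the test kernel $h(w) = \tfrac12 e^{-w}$ with $\alpha = 1$, the closed forms are $\tfrac14 e^{-s}$ in the first case and $\tfrac12 e^{-s}$ in the second.

```python
import math

def integrate(g, a, b, n=20000):
    """Composite midpoint rule for the integral of g over [a, b]."""
    step = (b - a) / n
    return step * sum(g(a + (i + 0.5) * step) for i in range(n))

def phi_minus_single_point(h, alpha, s, cutoff=40.0):
    """First case at t = 0: -T_0 ~ Exp(alpha) is the only point before 0."""
    num = integrate(lambda z: h(z) * alpha * math.exp(-alpha * z), s, s + cutoff)
    den = integrate(lambda z: alpha * math.exp(-alpha * z), s, s + cutoff)
    return num / den

def phi_minus_poisson_past(h, alpha, s, cutoff=40.0):
    """Second case at t = 0: N_- is a homogeneous Poisson process of intensity alpha."""
    return alpha * integrate(h, s, s + cutoff)
```

Although both pasts give the same law for $-T_0$, the two functions return different values for the same `s`, which is the dependence on the whole past that the remark points out.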


Remark 4.4. If we follow our first guess, $\rho_{\lambda,\mathbb{P}_0}$ would be either $\mu + \int_0^{t} h(t-x)\,u(dx,0)$
or $\mu + \int_{-\infty}^{t} h(t-x)\,u(dx,0)$. In particular, it would not depend on the age
$s$. Therefore by (4.20), so would $\Phi^{\mu,h}_{\mathbb{P}_0}$. But for instance at time $t = 0$, when $N_-$ is
a homogeneous Poisson process of parameter $\alpha$, $\Phi^{\mu,h}_{\mathbb{P}_0}(0,s) = \mu + \alpha\int_s^{+\infty} h(w)\,dw$,
which obviously depends on $s$. Therefore the intuition linking Hawkes processes and
(PPS) does not apply.


4.3.2. Linear Hawkes process with no past before time 0

A classical framework in point processes theory is the case in $(\mathcal{A}^1_{N_-})$ where $T_0 \to
-\infty$, or equivalently, when $N$ has intensity $\lambda(t,\mathcal{F}^N_{t-}) = \mu + \int_0^{t-} h(t-x)\,N(dx)$. The
problem in this case is that the age at time 0 is not finite. The age is only finite for
times greater than the first spiking time $T_1$.

Here again, the quantity $v(t,s)$ proves more informative and easier to use: having
the distribution of $T_0$ going to $-\infty$ means that $\mathrm{Supp}(u^{in})$ goes to $+\infty$, so that the
initial condition for $v$ tends to the value 1 uniformly for any $0 \le s < +\infty$. If we
can prove that the contribution of $\Phi^{h}_{-,\mathbb{P}_0}$ vanishes, the following system is a good
candidate to be the limit system:

$$\frac{\partial}{\partial t} v^{\infty}(t,s) + \frac{\partial}{\partial s} v^{\infty}(t,s) + \Phi^{\mu,h}_{+}(t,s)\,v^{\infty}(t,s) = 0, \tag{4.21}$$

$$v^{\infty}(t,0) = 1, \qquad v^{\infty}(0,s) = 1, \tag{4.22}$$

where $\Phi^{\mu,h}_{+}$ is defined in Proposition 4.3. This leads us to the following proposition.


Proposition 4.4. Under the assumptions and notations of Proposition 4.3, consider
for all $M \ge 0$, $v_M$ the unique solution of System (4.14)-(4.15) with $\Phi$ given by
Proposition 4.3, case $(\mathcal{A}^1_{N_-})$, with $T_0$ uniformly distributed in $[-M-1,-M]$. Then,
as $M$ goes to infinity, $v_M$ converges uniformly on any set of the type $(0,T)\times(0,S)$
towards the unique solution $v^{\infty}$ of System (4.21)-(4.22).


Conclusion 

We present in this article a bridge between univariate point processes, which can model the behavior of one neuron through its spike train, and a deterministic age-structured PDE introduced by Pakdaman, Perthame and Salort, named (PPS). More precisely, Theorem 3.3 presents a PDE that is satisfied by the distribution $u$ of the age $s$ at time $t$, where the age represents the delay between time $t$ and the last spike before $t$. This is done in a very weak sense and some technical structure, namely $(\mathcal{P}_{Fubini})$, is required.

The main point is that the "firing rate", which is a deterministic quantity written as $p(s, X(t))$ in (PPS), becomes the conditional expectation of the intensity given the age at time $t$ in Theorem 3.3. This first makes clear that $p(s, X(t))$ should be interpreted as a hazard rate, which gives the probability that a neuron fires given that it has not fired yet. Next, it makes rigorous several "easy guess" bridges between both set-ups when the intensity only depends on the age. But it also explains why, when the intensity has a more complex shape (Wold, Hawkes), this term can keep in particular the memory of all that has happened before time 0.

One of the main points of the present study is the Hawkes process, for which what was clearly expected was a legitimation of the term $X(t)$ in the firing rate $p(s, X(t))$ of (PPS), which models the synaptic integration. This is not the case, and the interlinked equations that have been found for the cumulative distribution function $v(t, \cdot)$ have neither a simple nor a direct deterministic interpretation. However, one should keep in mind that the present bridge, in particular in the population-wide approach, has been built for independent neurons. This has been done to keep the complexity of the present work reasonable as a first step. But it is also quite obvious that interacting neurons cannot be independent. So one of the main questions is: can we recover (PPS) as a limit with precisely a term of the form $X(t)$ if we consider multivariate Hawkes processes that really model interacting neurons?

A. Proofs linked with the PDE
A.1. Proof of Proposition 3.1


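First, let us verify that $U$ satisfies Equation (3.2). For any $t > 0$,

$$U(t,ds) = \sum_{i \geq 0} \eta_{T_i}(t,ds)\, \mathbf{1}_{0 \leq t \leq T_{i+1}},$$

by definition of $U$. Yet, $\eta_{T_i}(t,ds) = \delta_{t-T_i}(ds)\, \mathbf{1}_{t > T_i}$, and the only $i \in \mathbb{N}$ such that $T_i < t \leq T_{i+1}$ is $i = N_{t-}$. So, for all $t > 0$, $U(t,ds) = \delta_{t - T_{N_{t-}}}(ds) = \delta_{S_{t-}}(ds)$.

Secondly, let us verify that $U$ satisfies $(\mathcal{P}_{Fubini})$. Let $\varphi \in \mathcal{M}_{c,b}(\mathbb{R}_+^2)$, and let $T$ be such that for all $t > T$, $\varphi(t,s) = 0$. Then, since $U(t,ds) = \sum_{i \geq 0} \eta_{T_i}(t,ds)\, \mathbf{1}_{0 \leq t \leq T_{i+1}}$,

$$\int_{\mathbb{R}_+} \left( \int_{\mathbb{R}_+} |\varphi(t,s)|\, U(t,ds) \right) dt \leq \int_{\mathbb{R}_+} \sum_{i \geq 0} \left( \int_{\mathbb{R}_+} |\varphi(t,s)|\, \eta_{T_i}(t,ds)\, \mathbf{1}_{0 \leq t \leq T_{i+1}} \right) dt$$

$$= \sum_{i \geq 0} \int_{\mathbb{R}_+} |\varphi(t, t-T_i)|\, \mathbf{1}_{t > T_i}\, \mathbf{1}_{0 \leq t \leq T_{i+1}}\, dt = \sum_{i \geq 0} \int_{\max(0,T_i)}^{T_{i+1}} |\varphi(t, t-T_i)|\, dt$$

$$= \int_0^{T_1} |\varphi(t, t-T_0)|\, dt + \sum_{i \,/\, 0 < T_i < T} \int_{T_i}^{T_{i+1}} |\varphi(t, t-T_i)|\, dt.$$

Since there is a finite number of points of $N$ between 0 and $T$, on $\Omega$, this quantity is finite and one can exchange the sum $\sum_{i \geq 0}$ and the integrals. Therefore, since all the $\eta_{T_i}$ satisfy $(\mathcal{P}_{Fubini})$ and $\varphi(t,s)\, \mathbf{1}_{0 \leq t \leq T_{i+1}}$ is in $\mathcal{M}_{c,b}(\mathbb{R}_+^2)$, so does $U$.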

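For the dynamics of $U$, similar computations lead for every $\varphi \in \mathcal{C}^\infty_{c,b}(\mathbb{R}_+^2)$ to

$$\iint \varphi(t,s)\, U(dt,ds) = \sum_{i \geq 0} \int_{\max(0,-T_i)}^{T_{i+1}-T_i} \varphi(s + T_i, s)\, ds.$$

We also have

$$\iint \left( \frac{\partial}{\partial t} + \frac{\partial}{\partial s} \right) \varphi(t,s)\, U(dt,ds) = \sum_{i \geq 0} \int_{\max(0,-T_i)}^{T_{i+1}-T_i} \left( \frac{\partial}{\partial t} + \frac{\partial}{\partial s} \right) \varphi(s + T_i, s)\, ds$$

$$= \sum_{i \geq 1} \left[ \varphi(T_{i+1}, T_{i+1} - T_i) - \varphi(T_i, 0) \right] + \varphi(T_1, T_1 - T_0) - \varphi(0, -T_0). \tag{A.1}$$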

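It remains to express the term with $\int_{x=0}^{\lambda(t, \mathcal{F}^N_{t-})} \Pi(dt,dx) = \sum_{i \geq 0} \delta_{T_{i+1}}(dt)$, that is

$$\iint \varphi(t,s)\, U(t,ds) \sum_{i \geq 0} \delta_{T_{i+1}}(dt) = \int \left( \int \varphi(t,s)\, U(t,ds) \right) \sum_{i \geq 0} \delta_{T_{i+1}}(dt)$$

$$= \int \varphi(t, S_{t-}) \sum_{i \geq 0} \delta_{T_{i+1}}(dt) = \sum_{i \geq 0} \varphi(T_{i+1}, T_{i+1} - T_i), \tag{A.2}$$

and, since $\int U(t,ds) = 1$ for all $t > 0$,

$$\iint \varphi(t,0)\, U(t,ds) \sum_{i \geq 0} \delta_{T_{i+1}}(dt) = \sum_{i \geq 0} \varphi(T_{i+1}, 0). \tag{A.3}$$

Identifying all the terms in the right-hand side of Equation (A.1), this leads to Equation (3.7), which is the weak formulation of System (3.4)-(3.6).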


A.2. Proof of Proposition 3.2

Let $\varphi \in \mathcal{M}_{c,b}(\mathbb{R}_+^2)$, and let $T$ be such that for all $t > T$, $\varphi(t,s) = 0$. Then,

$$\int |\varphi(t,s)|\, U(t,ds) \leq \|\varphi\|_{L^\infty}\, \mathbf{1}_{0 \leq t \leq T}, \tag{A.4}$$

since at any fixed time $t > 0$, $\int U(t,ds) = 1$. Therefore, the expectation $\mathbb{E}\left[ \int \varphi(t,s)\, U(t,ds) \right]$ is well-defined and finite and so $u(t, \cdot)$ is well-defined.

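On the other hand, at any fixed age $s$,

$$\int |\varphi(t,s)|\, U(dt,s) = \sum_{i=0}^{\infty} |\varphi(s + T_i, s)|\, \mathbf{1}_{0 \leq s \leq T_{i+1} - T_i} = \sum_{i \geq 0} |\varphi(s + T_i, s)|\, \mathbf{1}_{0 \leq s + T_i \leq T}\, \mathbf{1}_{0 \leq s \leq T_{i+1} - T_i},$$

because for all $t > T$, $\varphi(t,s) = 0$. Then, one can deduce the following bound:

$$\int |\varphi(t,s)|\, U(dt,s) \leq |\varphi(s + T_0, s)|\, \mathbf{1}_{-T_0 \leq s \leq T - T_0}\, \mathbf{1}_{0 \leq s \leq T_1 - T_0} + \sum_{i \geq 1} |\varphi(s + T_i, s)|\, \mathbf{1}_{0 \leq s \leq T}\, \mathbf{1}_{T_i \leq T}$$

$$\leq \|\varphi\|_{L^\infty} \left( \mathbf{1}_{-T_0 \leq s \leq T - T_0} + N_T\, \mathbf{1}_{0 \leq s \leq T} \right).$$

Since the intensity is $L^1_{loc}$ in expectation, $\mathbb{E}[N_T] = \mathbb{E}\left[ \int_0^T \lambda(t, \mathcal{F}^N_{t-})\, dt \right] < \infty$ and

$$\mathbb{E}\left[ \int |\varphi(t,s)|\, U(dt,s) \right] \leq \|\varphi\|_{L^\infty} \left( \mathbb{E}\left[ \mathbf{1}_{-T_0 \leq s \leq T - T_0} \right] + \mathbb{E}[N_T]\, \mathbf{1}_{0 \leq s \leq T} \right), \tag{A.5}$$

so the expectation is well-defined and finite and so $u(\cdot, s)$ is well-defined.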


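Now, let us show $(\mathcal{P}_{Fubini})$. First, Equation (A.4) implies

$$\int \mathbb{E}\left[ \int |\varphi(t,s)|\, U(t,ds) \right] dt \leq T\, \|\varphi\|_{L^\infty},$$

and Fubini's theorem implies that the following integrals are well-defined and that the following equality holds:

$$\int \mathbb{E}\left[ \int \varphi(t,s)\, U(t,ds) \right] dt = \mathbb{E}\left[ \iint \varphi(t,s)\, U(t,ds)\, dt \right]. \tag{A.6}$$

Secondly, Equation (A.5) implies

$$\int \mathbb{E}\left[ \int |\varphi(t,s)|\, U(dt,s) \right] ds \leq \|\varphi\|_{L^\infty} \left( T + T\, \mathbb{E}[N_T] \right),$$

by exchanging the integral with the expectation, and Fubini's theorem implies that the following integrals are well-defined and that the following equality holds:

$$\int \mathbb{E}\left[ \int \varphi(t,s)\, U(dt,s) \right] ds = \mathbb{E}\left[ \iint \varphi(t,s)\, U(dt,s)\, ds \right]. \tag{A.7}$$

Now, it only remains to use $(\mathcal{P}_{Fubini})$ for $U$ to deduce that the right-hand sides of Equations (A.6) and (A.7) are equal. Moreover, $(\mathcal{P}_{Fubini})$ for $U$ tells us that these two quantities are equal to $\mathbb{E}\left[ \iint \varphi(t,s)\, U(dt,ds) \right]$. This concludes the proof.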
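
A.3. Proof of Theorem 3.3

Let $\rho_{\lambda,\mathbb{P}_0}(t,s) := \liminf_{\varepsilon \downarrow 0} \dfrac{\mathbb{E}\left[ \lambda(t, \mathcal{F}^N_{t-})\, \mathbf{1}_{|S_{t-} - s| \leq \varepsilon} \right]}{\mathbb{P}\left( |S_{t-} - s| \leq \varepsilon \right)}$, for every $t > 0$ and $s \geq 0$. Since $(\lambda(t, \mathcal{F}^N_{t-}))_{t > 0}$ and $(S_{t-})_{t > 0}$ are predictable processes, and a fortiori progressive processes (see page 9 in [?]), $\rho_{\lambda,\mathbb{P}_0}$ is a measurable function of $(t,s)$.

For every $t > 0$, let $\mu_t$ be the measure defined by $\mu_t(A) = \mathbb{E}\left[ \lambda(t, \mathcal{F}^N_{t-})\, \mathbf{1}_A(S_{t-}) \right]$ for all measurable sets $A$. Since the local integrability in expectation of the intensity implies that, $dt$-a.e., $\mathbb{E}\left[ \lambda(t, \mathcal{F}^N_{t-}) \right] < \infty$, and since $u(t,ds)$ is the distribution of $S_{t-}$, $\mu_t$ is absolutely continuous with respect to $u(t,ds)$ for $dt$-almost every $t$.

Let $f_t$ denote the Radon-Nikodym derivative of $\mu_t$ with respect to $u(t,ds)$. For $u(t,ds)$-a.e. $s$, $f_t(s) = \mathbb{E}\left[ \lambda(t, \mathcal{F}^N_{t-}) \mid S_{t-} = s \right]$ by definition of the conditional expectation. Moreover, a theorem of Besicovitch claims that for $u(t,ds)$-a.e. $s$, $f_t(s) = \rho_{\lambda,\mathbb{P}_0}(t,s)$. Hence, the equality $\rho_{\lambda,\mathbb{P}_0}(t,s) = \mathbb{E}\left[ \lambda(t, \mathcal{F}^N_{t-}) \mid S_{t-} = s \right]$ holds $u(t,ds)dt = u(dt,ds)$-almost everywhere.

Next, in order to use $(\mathcal{P}_{Fubini})$, let us note that for any $T, K > 0$,

$$\rho^{K,T}_{\lambda,\mathbb{P}_0} : (t,s) \mapsto \left( \rho_{\lambda,\mathbb{P}_0}(t,s) \wedge K \right) \mathbf{1}_{0 \leq t \leq T} \in \mathcal{M}_{c,b}(\mathbb{R}_+^2). \tag{A.8}$$

Hence, $\iint \rho^{K,T}_{\lambda,\mathbb{P}_0}(t,s)\, u(dt,ds) = \int \left( \int \rho^{K,T}_{\lambda,\mathbb{P}_0}(t,s)\, u(t,ds) \right) dt$, which is always upper bounded by $\int_0^T \left( \int \rho_{\lambda,\mathbb{P}_0}(t,s)\, u(t,ds) \right) dt = \int_0^T \mu_t(\mathbb{R}_+)\, dt = \int_0^T \mathbb{E}\left[ \lambda(t, \mathcal{F}^N_{t-}) \right] dt < \infty$. Letting $K \to \infty$, one has that $\int_0^T \int \rho_{\lambda,\mathbb{P}_0}(t,s)\, u(dt,ds)$ is finite for all $T > 0$.

Once $\rho_{\lambda,\mathbb{P}_0}$ is correctly defined, the proof of Theorem 3.3 is a direct consequence of Proposition 3.1.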


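More precisely, let us show that (3.7) implies (3.13). Taking the expectation of (3.7) gives that for all $\varphi \in \mathcal{C}^\infty_{c,b}(\mathbb{R}_+^2)$,

$$\mathbb{E}\left[ \iint \left[ \varphi(t,s) - \varphi(t,0) \right] \left( \int_{x=0}^{\lambda(t, \mathcal{F}^N_{t-})} \Pi(dt,dx) \right) U(t,ds) \right] - \int \varphi(0,s)\, u^{in}(ds) - \iint \left( \partial_t + \partial_s \right) \varphi(t,s)\, u(dt,ds) = 0. \tag{A.9}$$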


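Let us denote $\psi(t,s) := \varphi(t,s) - \varphi(t,0)$. Due to Ogata's thinning construction, $\left( \int_{x=0}^{\lambda(t, \mathcal{F}^N_{t-})} \Pi(dt,dx) \right) = N(dt)\, \mathbf{1}_{t > 0}$, where $N$ is the point process constructed by thinning, and so

$$\mathbb{E}\left[ \iint \psi(t,s) \left( \int_{x=0}^{\lambda(t, \mathcal{F}^N_{t-})} \Pi(dt,dx) \right) U(t,ds) \right] = \mathbb{E}\left[ \int_{t > 0} \psi(t, S_{t-})\, N(dt) \right]. \tag{A.10}$$

But $\psi(t, S_{t-})$ is a $(\mathcal{F}^N_t)$-predictable process and

$$\mathbb{E}\left[ \int_{t > 0} |\psi(t, S_{t-})|\, \lambda(t, \mathcal{F}^N_{t-})\, dt \right] \leq \|\psi\|_{L^\infty}\, \mathbb{E}\left[ \int_0^T \lambda(t, \mathcal{F}^N_{t-})\, dt \right] < \infty,$$

hence, using the martingale property of the predictable intensity,

$$\mathbb{E}\left[ \int_{t > 0} \psi(t, S_{t-})\, N(dt) \right] = \mathbb{E}\left[ \int_{t > 0} \psi(t, S_{t-})\, \lambda(t, \mathcal{F}^N_{t-})\, dt \right]. \tag{A.11}$$

Moreover, thanks to Fubini's Theorem, the right-hand term is finite and equal to $\int \mathbb{E}\left[ \psi(t, S_{t-})\, \lambda(t, \mathcal{F}^N_{t-}) \right] dt$, which can also be seen as

$$\int \mathbb{E}\left[ \psi(t, S_{t-})\, \rho_{\lambda,\mathbb{P}_0}(t, S_{t-}) \right] dt = \int \psi(t,s)\, \rho_{\lambda,\mathbb{P}_0}(t,s)\, u(t,ds)\, dt. \tag{A.12}$$

For all $K > 0$, $\left( (t,s) \mapsto \psi(t,s) \left( \rho_{\lambda,\mathbb{P}_0}(t,s) \wedge K \right) \right) \in \mathcal{M}_{c,b}(\mathbb{R}_+^2)$ and, from $(\mathcal{P}_{Fubini})$, it is clear that $\int \psi(t,s) \left( \rho_{\lambda,\mathbb{P}_0}(t,s) \wedge K \right) u(t,ds)\, dt = \int \psi(t,s) \left( \rho_{\lambda,\mathbb{P}_0}(t,s) \wedge K \right) u(dt,ds)$. Since one can always upper-bound this quantity in absolute value by $\|\psi\|_{L^\infty} \int_0^T \int_s \rho_{\lambda,\mathbb{P}_0}(t,s)\, u(dt,ds)$, this is finite. Letting $K \to \infty$, one can show that

$$\int \psi(t,s)\, \rho_{\lambda,\mathbb{P}_0}(t,s)\, u(t,ds)\, dt = \int \psi(t,s)\, \rho_{\lambda,\mathbb{P}_0}(t,s)\, u(dt,ds). \tag{A.13}$$

Gathering (A.10)-(A.13) with (A.9) gives (3.13).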


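A.4. Proof of Corollary 3.4

For all $i \in \mathbb{N}^*$, let us denote $N^i_+ = N^i \cap (0, +\infty)$ and $N^i_- = N^i \cap \mathbb{R}_-$. Thanks to Proposition B.12, the processes $N^i_+$ can be seen as constructed via thinning of independent Poisson processes on $\mathbb{R}_+^2$. Let $(\Pi^i)_{i \in \mathbb{N}^*}$ be the sequence of point measures associated to independent Poisson processes of intensity 1 on $\mathbb{R}_+^2$ given by Proposition B.12. Let $T^i_0$ denote the closest point to 0 in $N^i_-$. In particular, $(T^i_0)_{i \in \mathbb{N}^*}$ is a sequence of i.i.d. random variables.

For each $i$, let $U^i$ denote the solution of the microscopic equation corresponding to $\Pi^i$ and $T^i_0$ as defined in Proposition 3.1 by (3.3). Using (3.2), it is clear that $\sum_{i=1}^n \delta_{S^i_{t-}}(ds) = \sum_{i=1}^n U^i(t,ds)$ for all $t > 0$. Then, for every $\varphi \in \mathcal{C}^\infty_{c,b}(\mathbb{R}_+^2)$,

$$\int \varphi(t,s) \left( \frac{1}{n} \sum_{i=1}^n \delta_{S^i_{t-}}(ds) \right) = \frac{1}{n} \sum_{i=1}^n \int \varphi(t,s)\, U^i(t,ds).$$

The right-hand side is an average of $n$ i.i.d. random variables with mean $\int \varphi(t,s)\, u(t,ds)$, so (3.14) clearly follows from the law of large numbers.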
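The law-of-large-numbers argument above can be illustrated on the simplest case. The following Python sketch assumes $n$ independent homogeneous Poisson neurons of rate `rate`, each with a spike at $T_0 = 0$, and the test function $\varphi(s) = e^{-s}$; these choices and the function names are illustrative assumptions, not part of the original proof. Under them, $\mathbb{E}[e^{-S_{t-}}]$ has the closed form used in `exact_mean_exp_test`.

```python
import math
import random


def age_at_time(t, rate, rng):
    """Age S_{t-} of a homogeneous Poisson neuron that spiked at T_0 = 0:
    time elapsed at t since its last spike before t."""
    last = 0.0
    while True:
        nxt = last + rng.expovariate(rate)
        if nxt >= t:
            return t - last
        last = nxt


def empirical_mean(phi, t, rate, n, seed=0):
    """Average of phi(S_{t-}^i) over n independent neurons."""
    rng = random.Random(seed)
    return sum(phi(age_at_time(t, rate, rng)) for _ in range(n)) / n


def exact_mean_exp_test(t, rate):
    """E[exp(-S_{t-})]: the age is exponential of parameter `rate`
    truncated at t, with an atom of mass exp(-rate * t) at age t."""
    decay = math.exp(-(1.0 + rate) * t)
    return rate / (1.0 + rate) * (1.0 - decay) + decay
```

For a large number of neurons, `empirical_mean(lambda s: math.exp(-s), t, rate, n)` should be close to `exact_mean_exp_test(t, rate)`, which is the convergence stated in (3.14) for this particular test function.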


B. Proofs linked with the various examples 
B.1. Renewal process

Proposition B.1. With the notations of Section [?], let $N$ be a point process on $\mathbb{R}$, with predictable age process $(S_{t-})_{t > 0}$, such that $T_0 = 0$ a.s. The following statements are equivalent:

(i) $N_+ = N \cap (0, +\infty)$ is a renewal process with ISI's distribution given by some density $\nu : \mathbb{R}_+ \to \mathbb{R}_+$.

(ii) $N$ admits $\lambda(t, \mathcal{F}^N_{t-}) = f(S_{t-})$ as an intensity on $(0, +\infty)$ and $(\lambda(t, \mathcal{F}^N_{t-}))_{t > 0}$ satisfies the local integrability assumption on the intensity, for some $f : \mathbb{R}_+ \to \mathbb{R}_+$.

In such a case, for all $x \geq 0$, $f$ and $\nu$ satisfy

• $\nu(x) = f(x) \exp\left( -\int_0^x f(y)\, dy \right)$ with the convention $\exp(-\infty) = 0$, (B.1)

• $f(x) = \dfrac{\nu(x)}{\int_x^\infty \nu(y)\, dy}$ if $\int_x^\infty \nu(y)\, dy \neq 0$, else $f(x) = 0$. (B.2)


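Proof. For (ii) $\Rightarrow$ (i). Since $T_0 = 0$ a.s., Point (2) of Proposition B.2, given later on for the general Wold case, implies that the ISI's of $N$ form a Markov chain of order 0, i.e. they are i.i.d. with density given by (B.1).

For (i) $\Rightarrow$ (ii). Let $x_0 = \inf\{x \geq 0,\ \int_x^\infty \nu(y)\, dy = 0\}$. It may be infinite. Let us define $f$ by (B.2) for every $0 \leq x < x_0$ and let $\tilde{N}$ be a point process on $\mathbb{R}$ such that $\tilde{N}_- = N_-$ and $\tilde{N}$ admits $\lambda(t, \mathcal{F}^{\tilde{N}}_{t-}) = f(S^{\tilde{N}}_{t-})$ as an intensity on $(0, +\infty)$, where $(S^{\tilde{N}}_{t-})_{t > 0}$ is the predictable age process associated to $\tilde{N}$. Applying (ii) $\Rightarrow$ (i) to $\tilde{N}$ gives that the ISI's of $\tilde{N}$ are i.i.d. with density given by

$$\tilde{\nu}(x) = \frac{\nu(x)}{\int_x^\infty \nu(y)\, dy} \exp\left( -\int_0^x \frac{\nu(y)}{\int_y^\infty \nu(z)\, dz}\, dy \right)$$

for every $0 \leq x < x_0$ and $\tilde{\nu}(x) = 0$ for $x \geq x_0$. It is clear that $\nu = \tilde{\nu}$ since the function $x \mapsto \dfrac{1}{\int_x^\infty \nu(y)\, dy} \exp\left( -\int_0^x \dfrac{\nu(y)}{\int_y^\infty \nu(z)\, dz}\, dy \right)$ is differentiable with derivative equal to 0 (and equals 1 at $x = 0$).

Since $N$ and $\tilde{N}$ are renewal processes with the same density $\nu$ and the same first point $T_0 = 0$, they have the same distribution. Since the intensity characterizes a point process, $N$ also admits $\lambda(t, \mathcal{F}^N_{t-}) = f(S_{t-})$ as an intensity on $(0, +\infty)$. Moreover, since $N$ is a renewal process, it is non-explosive in finite time and so $(\lambda(t, \mathcal{F}^N_{t-}))_{t > 0}$ satisfies the local integrability assumption on the intensity. □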
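The correspondence (B.1)-(B.2) between the hazard rate $f$ and the ISI density $\nu$ can be checked numerically. The sketch below uses plain midpoint quadrature; the function names, the truncation bound `upper`, and the example hazard $f(x) = 2x$ (for which $\nu(x) = 2x e^{-x^2}$) are assumptions made for illustration only.

```python
import math


def density_from_hazard(f, x, n=20_000):
    """(B.1): nu(x) = f(x) * exp(-int_0^x f(y) dy), midpoint rule."""
    step = x / n
    integral = sum(f((i + 0.5) * step) for i in range(n)) * step
    return f(x) * math.exp(-integral)


def hazard_from_density(nu, x, upper=50.0, n=200_000):
    """(B.2): f(x) = nu(x) / int_x^inf nu(y) dy, with the tail integral
    truncated at x + upper; returns 0 when the tail mass vanishes."""
    step = upper / n
    tail = sum(nu(x + (i + 0.5) * step) for i in range(n)) * step
    return nu(x) / tail if tail > 0 else 0.0


def example_hazard(x):
    """Hypothetical hazard rate f(x) = 2x."""
    return 2.0 * x


def example_density(x):
    """ISI density associated with example_hazard through (B.1)."""
    return 2.0 * x * math.exp(-x * x)
```

Going from `example_hazard` to the density with (B.1) and back with (B.2) should return the original hazard up to quadrature error, which is the content of the equivalence in Proposition B.1.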


B.2. Generalized Wold processes 


In this Section, we suppose that there exists $k \geq 0$ such that the underlying point process $N$ has intensity

$$\lambda(t, \mathcal{F}^N_{t-}) = f(S_{t-}, A^1_t, \ldots, A^k_t), \tag{B.3}$$

where $f$ is a function and the $A^i$'s are defined by Equation (2.2).

B.2.1. Markovian property and the resulting PDE

Let $N$ be a point process of intensity given by (B.3). If $T_{-k} > -\infty$, its associated age process $(S_t)_t$ can be defined up to $t > T_{-k}$. Then let, for any integer $i \geq -k$,

$$A_i = T_{i+1} - T_i = S_{T_{i+1}-} \tag{B.4}$$




















June 9, 2015 0:43 WSPC/INSTRUCTION FILE PDE'Hawkes'Mariell 


32 J. Chevallier, M. Cdceres, M. Doumic, P. Reynaud-Bouret 


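and denote $(\mathcal{F}^A_i)_{i \geq -k}$ the natural filtration associated to $(A_i)_{i \geq -k}$.

For any $t \geq 0$ and point process $\Pi$ on $\mathbb{R}_+^2$, let us denote $\Pi_{\geq t}$ (resp. $\Pi_{> t}$) the restriction to $\mathbb{R}_+^2$ (resp. $(0, +\infty) \times \mathbb{R}_+$) of the point process $\Pi$ shifted $t$ time units to the left on the first coordinate. That is, $\Pi_{\geq t}(C \times D) = \Pi((t + C) \times D)$ for all $C \in \mathcal{B}(\mathbb{R}_+)$, $D \in \mathcal{B}(\mathbb{R}_+)$ (resp. $C \in \mathcal{B}((0, +\infty))$).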


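Proposition B.2.

Consider $k$ a non-negative integer, $f$ some non-negative function on $\mathbb{R}_+^{k+1}$ and $N$ a generalized Wold process of intensity given by (B.3). Suppose that $\mathbb{P}_0$ is such that $\mathbb{P}_0(T_{-k} > -\infty) = 1$ and that $(\lambda(t, \mathcal{F}^N_{t-}))_{t > 0}$ satisfies the local integrability assumption on the intensity. Then,

(1) If $(X_t)_{t \geq 0} = \left( (S_{t-}, A^1_t, \ldots, A^k_t) \right)_{t \geq 0}$, then for any finite non-negative stopping time $\tau$, $(X^\tau_t)_{t \geq 0} = (X_{t+\tau})_{t \geq 0}$ is independent of $\mathcal{F}^N_{\tau-}$ given $X_\tau$.

(2) The process $(A_i)_{i \geq 1}$ given by (B.4) forms a Markov chain of order $k$ with transition measure given by

$$\nu(dx, y_1, \ldots, y_k) = f(x, y_1, \ldots, y_k) \exp\left( -\int_0^x f(z, y_1, \ldots, y_k)\, dz \right) dx. \tag{B.5}$$

If $T_0 = 0$ a.s., this holds for $(A_i)_{i \geq 0}$.

If $f$ is continuous then $\mathcal{G}$, the infinitesimal generator of $(X_t)_{t \geq 0}$, is given by

$$\forall \phi \in \mathcal{C}^1(\mathbb{R}_+^{k+1}), \quad (\mathcal{G}\phi)(s, a_1, \ldots, a_k) = \frac{\partial}{\partial s} \phi(s, a_1, \ldots, a_k) + f(s, a_1, \ldots, a_k) \left( \phi(0, s, a_1, \ldots, a_{k-1}) - \phi(s, a_1, \ldots, a_k) \right). \tag{B.6}$$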
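Point (2) of Proposition B.2 says that the ISIs can be sampled as a Markov chain with transition measure (B.5). The following Python sketch illustrates this for $k = 1$ under a strong simplifying assumption: the intensity function `rate` is a hypothetical choice that depends on the previous ISI only and not on the age, so that (B.5) reduces to an exponential law whose parameter depends on the previous ISI. None of these names or parameter values come from the source.

```python
import math
import random


def rate(prev_isi, base=1.0, weight=2.0):
    """Hypothetical intensity f(s, a1) = base + weight * exp(-a1):
    constant in the age s, larger after a short previous ISI."""
    return base + weight * math.exp(-prev_isi)


def sample_isis(n, first_isi=1.0, seed=0):
    """Sample n further ISIs A_1, ..., A_n given A_0 = first_isi.
    Since f does not depend on s here, the transition (B.5) is the
    exponential distribution of parameter f(., previous ISI)."""
    rng = random.Random(seed)
    isis = [first_isi]
    for _ in range(n):
        isis.append(rng.expovariate(rate(isis[-1])))
    return isis


def spike_times(isis, start=0.0):
    """Cumulative sums of the ISIs: spike times following `start`."""
    times, current = [], start
    for gap in isis:
        current += gap
        times.append(current)
    return times
```

For an intensity that also depends on the age, the exponential draw would have to be replaced by inversion of the cumulative hazard $x \mapsto \int_0^x f(z, y_1)\, dz$ or by thinning.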


Proof. First, let us show the first point of the Proposition. Let $\Pi$ be such that $N$ is the process resulting from Ogata's thinning with Poisson measure $\Pi$. The existence of such a measure is ensured by Proposition B.12. We show that for any finite stopping time $\tau$, the process $(X^\tau_t)_{t \geq 0}$ can be expressed as a function of $X_\tau$ and $\Pi_{\geq \tau}$, which is the restriction to $\mathbb{R}_+^2$ of the Poisson process $\Pi$ shifted $\tau$ time units to the left on the first coordinate. Let $e_1 = (1, 0, \ldots, 0) \in \mathbb{R}^{k+1}$. For all $t \geq 0$, let $Y_t = X_\tau + t e_1$ and define

$$R_0 = \inf\left\{ t \geq 0,\ \int_{[0,t]} \int_{x=0}^{f(Y_w)} \Pi_{\geq \tau}(dw, dx) = 1 \right\}.$$

Note that $R_0$ may be null, in particular when $\tau$ is a jumping time of the underlying point process $N$. It is easy to check that $R_0$ can be expressed as a measurable function of $X_\tau$ and $\Pi_{\geq \tau}$. Moreover, it is clear that $X^\tau_{t \wedge R_0} = Y_{t \wedge R_0}$ for all $t \geq 0$. So, $R_0$ can be seen as the delay until the first point of the underlying process $N$ after time $\tau$. Suppose that $R_p$, the delay until the $(p+1)$th point, is constructed for some $p \geq 0$ and let us show how $R_{p+1}$ can be constructed. For $t \geq R_p$, let $Z_t = \theta(X^\tau_{R_p}) + t e_1$, where $\theta : (x_1, \ldots, x_{k+1}) \mapsto (0, x_1, \ldots, x_k)$ is a right shift operator modelling the dynamics described by (2.3). Let us define

$$R_{p+1} = \inf\left\{ t > R_p,\ \int_{(R_p, R_p + t]} \int_{x=0}^{f(Z_w)} \Pi_{\geq \tau}(dw, dx) = 1 \right\}. \tag{B.7}$$

Note that for any $p \geq 0$, $R_{p+1}$ cannot be null. This is coherent with the fact that the counting process $(N_t)_{t > 0}$ only admits jumps of height 1. It is easy to check that $R_{p+1}$ can be expressed as a measurable function of $\theta(X^\tau_{R_p})$ and $\Pi_{> \tau + R_p}$. It is also clear that $X^\tau_{t \wedge R_{p+1}} = Z_{t \wedge R_{p+1}}$ for all $t \geq R_p$. So, $R_{p+1}$ can be seen as the delay until the $(p+2)$th point of the process $N$ after time $\tau$. By induction, $X^\tau_{R_p}$ can be expressed as a function of $X_\tau$ and $\Pi_{\geq \tau}$, and this holds for $R_{p+1}$ and $X^\tau_{R_{p+1}}$ too.

To conclude, remark that the process $(X^\tau_t)_{t \geq 0}$ is a measurable function of $X_\tau$ and all the $R_p$'s for $p \geq 0$. Thanks to the independence of the Poisson measure $\Pi$, $\mathcal{F}^N_{\tau-}$ is independent of $\Pi_{\geq \tau}$. Then, since $(X^\tau_t)_{t \geq 0}$ is a function of $X_\tau$ and $\Pi_{\geq \tau}$, $(X^\tau_t)_{t \geq 0}$ is independent of $\mathcal{F}^N_{\tau-}$ given $X_\tau$, which concludes the first point.

For Point (2), fix $i \geq 1$ and apply Point (1) with $\tau = T_i$. It appears that in this case, $R_0 = 0$ and $R_1 = A_i$. Moreover, $R_1 = A_i$ can be expressed as a function of $\theta(X_\tau)$ and $\Pi_{> \tau}$. However, $\theta(X_\tau) = (0, A_{i-1}, \ldots, A_{i-k})$ and $\mathcal{F}^A_{i-1} \subset \mathcal{F}^N_{T_i}$. Since $\tau = T_i$, $\Pi_{> \tau}$ is independent of $\mathcal{F}^N_{T_i}$ and so $A_i$ is independent of $\mathcal{F}^A_{i-1}$ given $(A_{i-1}, \ldots, A_{i-k})$. That is, $(A_i)_{i \geq 1}$ forms a Markov chain of order $k$.

Note that if $T_0 = 0$ a.s. (in particular it is non-negative), then one can use the previous argument with $\tau = 0$ and conclude that the Markov chain starts one time step earlier, i.e. $(A_i)_{i \geq 0}$ forms a Markov chain of order $k$.

For (B.5), $R_1 = A_i$, defined by (B.7), has the same distribution as the first point of a Poisson process with intensity $\lambda(t) = f(t, A_{i-1}, \ldots, A_{i-k})$ thanks to the thinning Theorem. Hence, the transition measure of $(A_i)_{i \geq 1}$ is given by (B.5).

Now that $(X_t)_{t \geq 0}$ is Markovian, one can compute its infinitesimal generator. Suppose that $f$ is continuous and let $\phi \in \mathcal{C}^1_b(\mathbb{R}_+^{k+1})$. The generator of $(X_t)_{t \geq 0}$ is defined by $\mathcal{G}\phi(s, a_1, \ldots, a_k) = \lim_{h \to 0^+} \dfrac{P_h \phi - \phi}{h}$, where

$$P_h \phi(s, a_1, \ldots, a_k) = \mathbb{E}\left[ \phi(X_h) \mid X_0 = (s, a_1, \ldots, a_k) \right]$$

$$= \mathbb{E}\left[ \phi(X_h)\, \mathbf{1}_{\{N([0,h]) = 0\}} \mid X_0 = (s, a_1, \ldots, a_k) \right] + \mathbb{E}\left[ \phi(X_h)\, \mathbf{1}_{\{N([0,h]) > 0\}} \mid X_0 = (s, a_1, \ldots, a_k) \right]$$

$$= E_0 + E_{>0}.$$

The case with no jump is easy to compute:

$$E_0 = \phi(s + h, a_1, \ldots, a_k) \left( 1 - f(s, a_1, \ldots, a_k)\, h \right) + o(h), \tag{B.8}$$

thanks to the continuity of $f$. When $h$ is small, the probability of having two or more jumps in $[0,h]$ is $o(h)$, so the second case can be reduced to the case with exactly one random jump (namely $T$):

$$E_{>0} = \mathbb{E}\left[ \phi(X_h)\, \mathbf{1}_{\{N([0,h]) = 1\}} \mid X_0 = (s, a_1, \ldots, a_k) \right] + o(h)$$

$$= \mathbb{E}\left[ \phi\left( \theta(X_0 + T e_1) + (h - T) e_1 \right) \mathbf{1}_{\{N \cap [0,h] = \{T\}\}} \mid X_0 = (s, a_1, \ldots, a_k) \right] + o(h)$$

$$= \mathbb{E}\left[ \left( \phi(0, s, a_1, \ldots, a_{k-1}) + o(1) \right) \mathbf{1}_{\{N \cap [0,h] = \{T\}\}} \mid X_0 = (s, a_1, \ldots, a_k) \right] + o(h)$$

$$= \phi(0, s, a_1, \ldots, a_{k-1}) \left( f(s, a_1, \ldots, a_k)\, h \right) + o(h), \tag{B.9}$$

thanks to the continuity of $\phi$ and $f$. Gathering (B.8) and (B.9) with the definition of the generator gives (B.6). □


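B.2.2. Sketch of proof of Proposition 4.1

Let $N$ be the point process constructed by Ogata's thinning of the Poisson process $\Pi$ and $U_k$ be as defined in Proposition 4.1. By an easy generalisation of Proposition 3.1, one can prove that on the event $\Omega$ of probability 1, where Ogata's thinning is well defined and where $T_0 < 0$, $U_k$ satisfies $(\mathcal{P}_{Fubini})$, (4.3) and, on $\mathbb{R}_+ \times \mathbb{R}_+^{k+1}$, the following system in the weak sense:

$$\left( \frac{\partial}{\partial t} + \frac{\partial}{\partial s} \right) U_k(dt, ds, da) + \left( \int_{x=0}^{f(s, a_1, \ldots, a_k)} \Pi(dt, dx) \right) U_k(t, ds, da) = 0,$$

$$U_k(dt, 0, ds, da_1, \ldots, da_{k-1}) = \int_{a_k \in \mathbb{R}} \left( \int_{x=0}^{f(s, a_1, \ldots, a_k)} \Pi(dt, dx) \right) U_k(t, ds, da),$$

with $da = da_1 \times \ldots \times da_k$ and initial condition $U^{in} = \delta_{(-T_0, A^1_0, \ldots, A^k_0)}$.

Similarly to Proposition 3.2, one can also prove that for any test function $\varphi$ in $\mathcal{M}_{c,b}(\mathbb{R}_+^{k+2})$, $\mathbb{E}\left[ \int \varphi(t,s,a)\, U_k(t,ds,da) \right]$ and $\mathbb{E}\left[ \int \varphi(t,s,a)\, U_k(dt,s,da) \right]$ are finite and one can define $u_k(t,ds,da)$ and $u_k(dt,s,da)$ by, for all $\varphi$ in $\mathcal{M}_{c,b}(\mathbb{R}_+^{k+2})$,

$$\int \varphi(t,s,a)\, u_k(t,ds,da) = \mathbb{E}\left[ \int \varphi(t,s,a)\, U_k(t,ds,da) \right]$$

for all $t \geq 0$, and

$$\int \varphi(t,s,a)\, u_k(dt,s,da) = \mathbb{E}\left[ \int \varphi(t,s,a)\, U_k(dt,s,da) \right]$$

for all $s \geq 0$. Moreover, $u_k(t,ds,da)$ and $u_k(dt,s,da)$ satisfy $(\mathcal{P}_{Fubini})$ and one can define $u_k(dt,ds,da) = u_k(t,ds,da)\,dt = u_k(dt,s,da)\,ds$ on $\mathbb{R}_+^2 \times \mathbb{R}_+^k$, such that for any test function $\varphi$ in $\mathcal{M}_{c,b}(\mathbb{R}_+^{k+2})$,

$$\int \varphi(t,s,a)\, u_k(dt,ds,da) = \mathbb{E}\left[ \int \varphi(t,s,a)\, U_k(dt,ds,da) \right],$$

a quantity which is finite. The end of the proof is completely analogous to the one of Theorem 3.3.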

B.3. Linear Hawkes processes
B.3.1. Cluster decomposition

Proposition B.3. Let $g$ be a non-negative $L^1_{loc}(\mathbb{R}_+)$ function and $h$ a non-negative $L^1(\mathbb{R}_+)$ function such that $\|h\|_1 < 1$. Then the branching point process $N$ is defined as the set of all the points in all generations constructed as follows:

• Ancestral points are $N_{anc}$, distributed as a Poisson process of intensity $g$; $N_0 := N_{anc}$ can be seen as the points of generation 0.

• Conditionally to $N_{anc}$, each ancestor $a \in N_{anc}$ gives birth, independently of anything else, to children points $N_{1,a}$ according to a Poisson process of intensity $h(\cdot - a)$; $N_1 = \bigcup_{a \in N_{anc}} N_{1,a}$ forms the first generation points.

Then the construction is recursive in $k$, the number of generations:

• Denoting $N_k$ the set of points in generation $k$, then conditionally to $N_k$, each point $x \in N_k$ gives birth, independently of anything else, to children points $N_{k+1,x}$ according to a Poisson process of intensity $h(\cdot - x)$; $N_{k+1} = \bigcup_{x \in N_k} N_{k+1,x}$ forms the points of the $(k+1)$th generation.

This construction ends almost surely in every finite interval. Moreover the intensity of $N$ exists and is given by

$$\lambda(t, \mathcal{F}^N_{t-}) = g(t) + \int_0^{t-} h(t - x)\, N(dx).$$

This is the cluster representation of the Hawkes process. When $g = \nu$, this has been proved in [?]. However, to our knowledge, this has not been written for a general function $g$.
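The generation-by-generation construction of Proposition B.3 translates directly into a simulation. The Python sketch below is a minimal illustration under assumptions that are not in the text: a constant ancestor intensity $g \equiv$ `mu`, an exponential kernel $h(t) = a e^{-bt}$ with $a < b$ (so that $\|h\|_1 = a/b < 1$), and a finite observation window $[0, \text{horizon}]$. The helper `poisson_count` uses Knuth's multiplication method and is only suitable for moderate means.

```python
import math
import random


def poisson_count(mean, rng):
    """Poisson random variable by Knuth's method (moderate means only)."""
    limit, count, prod = math.exp(-mean), 0, rng.random()
    while prod > limit:
        count += 1
        prod *= rng.random()
    return count


def simulate_branching_hawkes(mu, a, b, horizon, seed=0):
    """Points on [0, horizon] of the branching process of Proposition B.3,
    assuming g = mu (constant) and h(t) = a * exp(-b * t) with a < b."""
    if not 0.0 <= a < b:
        raise ValueError("need 0 <= a < b so that the L1 norm of h is < 1")
    rng = random.Random(seed)
    # Generation 0: homogeneous Poisson process of rate mu on [0, horizon].
    generation = [rng.uniform(0.0, horizon)
                  for _ in range(poisson_count(mu * horizon, rng))]
    points = []
    while generation:
        points.extend(generation)
        children = []
        for parent in generation:
            # Children of `parent`: Poisson process of intensity h(. - parent),
            # i.e. a Poisson(a / b) number of points at exponential(b) delays.
            for _ in range(poisson_count(a / b, rng)):
                child = parent + rng.expovariate(b)
                if child <= horizon:  # later points cannot have earlier descendants
                    children.append(child)
        generation = children
    return sorted(points)
```

Discarding a child born after `horizon` also discards its descendants, which is harmless because descendants are always born later than their parent. The loop stops as soon as one generation is empty, mirroring the almost sure extinction of the sub-critical Galton Watson process.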

Proof. First, let us fix some $A > 0$. The process ends up almost surely in $[0, A]$ because there is a.s. a finite number of ancestors in $[0, A]$: if we consider the family of points attached to one particular ancestor, the number of points in each generation forms a sub-critical Galton Watson process whose reproduction distribution is a Poisson variable with mean $\int h < 1$ and whose extinction is consequently almost sure.

Next, to prove that $N$ has intensity

$$H(t) = g(t) + \int_0^{t-} h(t - x)\, N(dx),$$

we exhibit a particular thinning construction where, on the one hand, $N$ is indeed a branching process by construction as defined by the proposition and which, on the other hand, guarantees that Ogata's thinning projects the points below $H(t)$. We can always assume that $h(0) = 0$, since changing the intensity of a Poisson process in the branching structure at one particular point has no impact. Hence $H(t) = g(t) + \int_0^t h(t - x)\, N(dx)$.

The construction is recursive in the same way. Fix some realisation $\Pi$ of a Poisson process on $\mathbb{R}_+^2$.

For $N_{anc}$, project the points below the curve $t \mapsto g(t)$ on $[0, A]$. By construction, $N_{anc}$ is a Poisson process of intensity $g(t)$ on $[0, A]$. Note that for the identification (see Theorem B.11) we just need to do it on finite intervals and that the ancestors that may be born after time $A$ do not have any descendants in $[0, A]$, so we can discard them, since they do not appear in $H(t)$ for $t \leq A$.

Enumerate the points in $N_{anc} \cap [0, A]$ from $T_1$ to $T_{N_{0,\infty}}$.

• The children of $T_1$, $N_{1,T_1}$, are given by the projection of the points of $\Pi$ whose ordinates are in the strip $t \mapsto (g(t), g(t) + h(t - T_1)]$. As before, by the property of spatial independence of $\Pi$, this is a Poisson process of intensity $h(\cdot - T_1)$ conditionally to $N_{anc}$.

• Repeat until $T_{N_{0,\infty}}$, where $N_{1,T_{N_{0,\infty}}}$ are given by the projection of the points of $\Pi$ whose ordinates are in the strip $t \mapsto \left( g(t) + \sum_{i=1}^{N_{0,\infty}-1} h(t - T_i),\ g(t) + \sum_{i=1}^{N_{0,\infty}} h(t - T_i) \right]$. As before, by the property of independence of $\Pi$, this is a Poisson process of intensity $h(\cdot - T_{N_{0,\infty}})$ conditionally to $N_{anc}$ and, because the consecutive strips do not overlap, this process is completely independent of the previous processes $(N_{1,T_i})$'s that have been constructed.


Note that at the end of this first generation, $N_1 = \bigcup_{T \in N_{anc}} N_{1,T}$ consists of the projection of points of $\Pi$ in the strip $t \mapsto \left( g(t),\ g(t) + \sum_{i=1}^{N_{0,\infty}} h(t - T_i) \right]$. They therefore form a Poisson process of intensity $\sum_{i=1}^{N_{0,\infty}} h(t - T_i) = \int h(t - u)\, N_{anc}(du)$, conditionally to $N_{anc}$.

For generation $k+1$, replace in the previous construction $N_{anc}$ by $N_k$ and $g(t)$ by $g(t) + \sum_{j=0}^{k-1} \int h(t - u)\, N_j(du)$. Once again we end up, for each point $x$ in $N_k$, with a process of children $N_{k+1,x}$ which is a Poisson process of intensity $h(t - x)$ conditionally to $N_k$ and which is totally independent of the other $N_{k+1,y}$'s. Note also that, as before, $N_{k+1} = \bigcup_{x \in N_k} N_{k+1,x}$ is a Poisson process of intensity $\int h(t - u)\, N_k(du)$, conditionally to $N_0, \ldots, N_k$.

Hence we are indeed constructing a branching process as defined by the proposition. Because the underlying Galton Watson process ends almost surely, as shown before, there exists a.s. one generation $N_{k^*}$ which will be completely empty and our recursive construction ends up too.

The main point is to realize that at the end the points in $N = \bigcup_{k=0}^\infty N_k$ are exactly the projection of the points in $\Pi$ that are below

$$t \mapsto g(t) + \sum_{k=0}^\infty \int h(t - u)\, N_k(du) = g(t) + \sum_{k=0}^\infty \int_0^t h(t - u)\, N_k(du),$$

hence below

$$t \mapsto g(t) + \int_0^t h(t - u)\, N(du) = H(t).$$

Moreover $H(t)$ is predictable. Therefore, by Theorem B.11, $N$ has intensity $H(t)$, which concludes the proof. □

A cluster process $N_c$ is a branching process, as defined before, which admits intensity $\lambda(t, \mathcal{F}^{N_c}_{t-}) = h(t) + \int_0^{t-} h(t - z)\, N_c(dz)$. Its distribution only depends on the function $h$. It corresponds to the family generated by one ancestor at time 0 in Proposition B.3. Therefore, by Proposition B.3, a Hawkes process with empty past ($N_- = \emptyset$) of intensity $\lambda(t, \mathcal{F}^N_{t-}) = g(t) + \int_0^{t-} h(t - z)\, N(dz)$ can always be seen as the union of $N_{anc}$ and of all the $a + N^a_c$ for $a \in N_{anc}$, where the $N^a_c$ are i.i.d. cluster processes.

For a Hawkes process $N$ with non-empty past, $N_-$, this is more technical. Let $N_{anc}$ be a Poisson process of intensity $g$ on $\mathbb{R}_+$ and $(N^V_c)_{V \in N_{anc}}$ be a sequence of i.i.d. cluster processes associated to $h$. Let also

$$N_{>0} = N_{anc} \cup \left( \bigcup_{V \in N_{anc}} V + N^V_c \right). \tag{B.10}$$

As we prove below, this represents the points in $N$ that do not depend on $N_-$. The points that depend on $N_-$ are constructed as follows, independently of $N_{>0}$. Given $N_-$, let $(N^T_1)_{T \in N_-}$ denote a sequence of independent Poisson processes with respective intensities $\lambda_T(v) = h(v - T)\, \mathbf{1}_{(0,\infty)}(v)$. Then, given $N_-$ and $(N^T_1)_{T \in N_-}$, let $(N^{T,V}_c)_{V \in N^T_1,\, T \in N_-}$ be a sequence of i.i.d. cluster processes associated to $h$. The points depending on the past $N_-$ are given by the following formula, as proved in the next Proposition:

$$N_{\leq 0} = N_- \cup \left( \bigcup_{T \in N_-} \left( N^T_1 \cup \left( \bigcup_{V \in N^T_1} V + N^{T,V}_c \right) \right) \right). \tag{B.11}$$
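Proposition B.4. Let $N = N_{\leq 0} \cup N_{>0}$, where $N_{>0}$ and $N_{\leq 0}$ are given by (B.10) and (B.11). Then $N$ is a linear Hawkes process with past given by $N_-$ and intensity on $(0, \infty)$ given by $\lambda(t, \mathcal{F}^N_{t-}) = g(t) + \int_{-\infty}^{t-} h(t - x)\, N(dx)$, where $g$ and $h$ are as in Proposition B.3.

Proof. Proposition B.3 yields that $N_{>0}$ has intensity

$$\lambda_{N_{>0}}(t, \mathcal{F}^{N_{>0}}_{t-}) = g(t) + \int_0^{t-} h(t - x)\, N_{>0}(dx), \tag{B.12}$$

and that, given $N_-$, for any $T \in N_-$, $N^T_H = N^T_1 \cup \left( \bigcup_{V \in N^T_1} V + N^{T,V}_c \right)$ has intensity

$$\lambda_{N^T_H}(t, \mathcal{F}^{N^T_H}_{t-}) = h(t - T) + \int_0^{t-} h(t - x)\, N^T_H(dx). \tag{B.13}$$

Moreover, all these processes are independent given $N_-$. For any $t \geq 0$, one can note that $\mathcal{F}^{N_{\leq 0}}_t \subset \mathcal{G}_t := \mathcal{F}^{N_-}_0 \vee \left( \bigvee_{T \in N_-} \mathcal{F}^{N^T_H}_t \right)$, so $N_{\leq 0}$ has intensity

$$\lambda_{N_{\leq 0}}(t, \mathcal{G}_{t-}) = \sum_{T \in N_-} \lambda_{N^T_H}(t, \mathcal{F}^{N^T_H}_{t-}) = \int_{-\infty}^{t-} h(t - x)\, N_{\leq 0}(dx) \tag{B.14}$$

on $(0, +\infty)$. Since this last expression is $\mathcal{F}^{N_{\leq 0}}_t$-predictable, by page 27 in [?], this is also $\lambda_{N_{\leq 0}}(t, \mathcal{F}^{N_{\leq 0}}_{t-})$. Moreover, $N_{\leq 0}$ and $N_{>0}$ are independent by construction and, for any $t \geq 0$, $\mathcal{F}^N_t \subset \mathcal{F}^{N_{\leq 0}}_t \vee \mathcal{F}^{N_{>0}}_t$. Hence, as before, $N$ has intensity on $(0, +\infty)$ given by

$$\lambda(t, \mathcal{F}^N_{t-}) = \lambda_{N_{>0}}(t, \mathcal{F}^{N_{>0}}_{t-}) + \lambda_{N_{\leq 0}}(t, \mathcal{F}^{N_{\leq 0}}_{t-}) = g(t) + \int_{-\infty}^{t-} h(t - x)\, N(dx). \quad □$$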


B.3.2. A general result for linear Hawkes processes

The following proposition is a consequence of Theorem 3.3 applied to Hawkes processes with general past $N_-$.

Proposition B.5. Using the notations of Theorem 3.3, let $N$ be a Hawkes process with past before 0 given by $N_-$ of distribution $\mathbb{P}_0$ and with intensity on $\mathbb{R}_+$ given by

$$\lambda(t, \mathcal{F}^N_{t-}) = \mu + \int_{-\infty}^{t-} h(t - x)\, N(dx),$$

where $\mu$ is a positive real number and $h$ is a non-negative function with support in $\mathbb{R}_+$ such that $\int h < 1$. Suppose that $\mathbb{P}_0$ is such that

$$\sup_{t \geq 0} \mathbb{E}\left[ \int_{-\infty}^0 h(t - x)\, N_-(dx) \right] < \infty. \tag{B.15}$$

Then, the mean measure $u$ defined in Proposition 3.2 satisfies Theorem 3.3 and moreover its integral $v(t,s) := \int_s^\infty u(t, d\sigma)$ is a solution of the system (4.14)-(4.15) where $v^{in}$ is the survival function of $-T_0$, and where $\Phi = \Phi^{\mu,h}_{\mathbb{P}_0}$ is given by $\Phi^{\mu,h}_{\mathbb{P}_0} = \Phi^{\mu,h}_+ + \Phi^{\mu,h}_{-,\mathbb{P}_0}$, with $\Phi^{\mu,h}_+$ given by (4.17) and $\Phi^{\mu,h}_{-,\mathbb{P}_0}$ given by,

$$\forall s, t \geq 0, \quad \Phi^{\mu,h}_{-,\mathbb{P}_0}(t,s) = \mathbb{E}\left[ \int_{-\infty}^{t-} h(t - z)\, N_{\leq 0}(dz) \,\Big|\, N_{\leq 0}([t - s, t)) = 0 \right]. \tag{B.16}$$

Moreover, (4.20) holds.


B.3.3. Proof of the general result of Proposition B.5

Before proving Proposition B.5, we need some technical preliminaries.

Events of the type $\{S_{t-} \geq s\}$ are equivalent to the fact that the underlying process has no point between $t - s$ and $t$. Therefore, for any point process $N$ and any real numbers $t, s \geq 0$, let

$$\mathcal{E}_{t,s}(N) = \{N \cap [t - s, t) = \emptyset\}. \tag{B.17}$$

Various sets $\mathcal{E}_{t,s}(N)$ are used in the sequel and the following lemma, whose proof is obvious and therefore omitted, is applied several times to those sets.
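
Lemma B.6. Let $Y$ be some random variable and $I(Y)$ some countable set of indices depending on $Y$. Suppose that $(X_i)_{i \in I(Y)}$ is a sequence of random variables which are independent conditionally on $Y$. Let $A(Y)$ be some event depending on $Y$ and, $\forall j \in I(Y)$, $B_j = B_j(Y, X_j)$ be some event depending on $Y$ and $X_j$. Then, for any $i \in I(Y)$, and for all sequences of measurable functions $(f_i)_{i \in I(Y)}$ such that the following quantities exist,

$$\mathbb{E}\left[ \sum_{i \in I(Y)} f_i(Y, X_i) \,\Big|\, A \# B \right] = \mathbb{E}\left[ \sum_{i \in I(Y)} \mathbb{E}\left[ f_i(Y, X_i) \mid Y, B_i \right] \,\Big|\, A \# B \right],$$

where $\mathbb{E}\left[ f_i(Y, X_i) \mid Y, B_i \right] = \dfrac{\mathbb{E}\left[ f_i(Y, X_i)\, \mathbf{1}_{B_i} \mid Y \right]}{\mathbb{P}(B_i \mid Y)}$ and $A \# B = A(Y) \cap \left( \bigcap_{j \in I(Y)} B_j \right)$.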

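The following lemma is linked to Lemma 4.2.

Lemma B.7. Let $N$ be a linear Hawkes process with no past before time 0 (i.e. $N_- = \emptyset$) and intensity on $(0, \infty)$ given by $\lambda(t, \mathcal{F}^N_{t-}) = g(t) + \int_0^{t-} h(t - x)\, N(dx)$, where $g$ and $h$ are as in Proposition B.3, and let for any $x, s \geq 0$

$$L^{g,h}_s(x) = \mathbb{E}\left[ \int_0^x h(x - z)\, N(dz) \,\Big|\, \mathcal{E}_{x,s}(N) \right], \qquad G^{g,h}_s(x) = \mathbb{P}\left( \mathcal{E}_{x,s}(N) \right).$$

Then, for any $x, s \geq 0$,

$$L^{g,h}_s(x) = \int_{s \wedge x}^x \left( h(z) + L^{h,h}_s(z) \right) G^{h,h}_s(z)\, g(x - z)\, dz, \tag{B.18}$$

and

$$\log\left( G^{g,h}_s(x) \right) = \int_0^{(x-s) \vee 0} G^{h,h}_s(x - z)\, g(z)\, dz - \int_0^x g(z)\, dz. \tag{B.19}$$

In particular, $(L^{h,h}_s, G^{h,h}_s)$ is in $L^1 \times L^\infty$ and is a solution of (4.11)-(4.12).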


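Proof. The statement only depends on the distribution of $N$. Hence, thanks to Proposition B.4, it is sufficient to consider $N = N_{anc} \cup \left( \bigcup_{V \in N_{anc}} V + N^V_c \right)$.

Let us show (B.18). First, let us write $L^{g,h}_s(x) = \mathbb{E}\left[ \sum_{X \in N} h(x - X) \mid \mathcal{E}_{x,s}(N) \right]$ and note that $L^{g,h}_s(x) = 0$ if $x \leq s$. The following decomposition holds:

$$L^{g,h}_s(x) = \mathbb{E}\left[ \sum_{V \in N_{anc}} \left( h(x - V) + \sum_{W \in N^V_c} h(x - V - W) \right) \,\Big|\, \mathcal{E}_{x,s}(N) \right].$$

According to Lemma B.6 and the following decomposition,

$$\mathcal{E}_{x,s}(N) = \mathcal{E}_{x,s}(N_{anc}) \cap \left( \bigcap_{V \in N_{anc}} \mathcal{E}_{x-V,s}(N^V_c) \right), \tag{B.20}$$

let us denote $Y = N_{anc}$, $X_V = N^V_c$ and $B_V = \mathcal{E}_{x-V,s}(N^V_c)$ for all $V \in N_{anc}$. Let us fix $V \in N_{anc}$ and compute the conditional expectation of the inner sum with respect to the filtration of $N_{anc}$, which is

$$\mathbb{E}\left[ \sum_{W \in N^V_c} h(x - V - W) \,\Big|\, Y, B_V \right] = \mathbb{E}\left[ \sum_{W \in N_c} h((x - V) - W) \,\Big|\, \mathcal{E}_{x-V,s}(N_c) \right] = L^{h,h}_s(x - V), \tag{B.21}$$

since, conditionally on $N_{anc}$, $N^V_c$ has the same distribution as $N_c$, which is a linear Hawkes process with conditional intensity $\lambda(t, \mathcal{F}^{N_c}_{t-}) = h(t) + \int_0^{t-} h(t - z)\, N_c(dz)$.

Using the conditional independence of the cluster processes with respect to $N_{anc}$, one can apply Lemma B.6 and deduce that

$$L^{g,h}_s(x) = \mathbb{E}\left[ \sum_{V \in N_{anc}} \left( h(x - V) + L^{h,h}_s(x - V) \right) \,\Big|\, \mathcal{E}_{x,s}(N) \right].$$

The following argument is inspired by Møller [?]. For every $V \in N_{anc}$, we say that $V$ has mark 0 if neither $V$ nor any of its descendants lies in $[x - s, x)$, and mark 1 otherwise. Let us denote $N^0_{anc}$ the set of points with mark 0 and $N^1_{anc} = N_{anc} \setminus N^0_{anc}$. For any $V \in N_{anc}$, we have $\mathbb{P}\left( V \in N^0_{anc} \mid N_{anc} \right) = G^{h,h}_s(x - V)\, \mathbf{1}_{[x-s,x)^c}(V)$, and all the marks are chosen independently given $N_{anc}$. Hence, $N^0_{anc}$ and $N^1_{anc}$ are independent Poisson processes and the intensity of $N^0_{anc}$ is given by $\lambda(v) = g(v)\, G^{h,h}_s(x - v)\, \mathbf{1}_{[x-s,x)^c}(v)$. Moreover, the event $\{N^1_{anc} = \emptyset\}$ can be identified with $\mathcal{E}_{x,s}(N)$ and

$$L^{g,h}_s(x) = \mathbb{E}\left[ \sum_{V \in N^0_{anc}} \left( h(x - V) + L^{h,h}_s(x - V) \right) \,\Big|\, N^1_{anc} = \emptyset \right]$$

$$= \int_{-\infty}^{x-} \left( h(x - w) + L^{h,h}_s(x - w) \right) g(w)\, G^{h,h}_s(x - w)\, \mathbf{1}_{[x-s,x)^c}(w)\, dw$$

$$= \int_0^{(x-s) \vee 0} \left( h(x - w) + L^{h,h}_s(x - w) \right) G^{h,h}_s(x - w)\, g(w)\, dw,$$

where we used the independence between the two Poisson processes. It suffices to substitute $w$ by $z = x - w$ in the integral to get the desired formula. Since $G^{h,h}_s$ is bounded, it is obvious that $L^{h,h}_s$ is $L^1$.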

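Then, let us show (B.19). First note that if $x < 0$, $G^{g,h}_s(x) = 1$. Next, following (B.20), one has $G^{g,h}_s(x) = \mathbb{E}\left[ \mathbf{1}_{\mathcal{E}_{x,s}(N_{anc})} \prod_{X \in N_{anc}} \mathbf{1}_{\mathcal{E}_{x-X,s}(N^X_c)} \right]$. This is also

$$G^{g,h}_s(x) = \mathbb{E}\left[ \mathbf{1}_{N_{anc} \cap [x-s,x) = \emptyset} \prod_{V \in N_{anc} \cap [x-s,x)^c} \mathbf{1}_{\mathcal{E}_{x-V,s}(N^V_c)} \right] = \mathbb{E}\left[ \mathbf{1}_{N_{anc} \cap [x-s,x) = \emptyset} \prod_{V \in N_{anc} \cap [x-s,x)^c} G^{h,h}_s(x - V) \right],$$

by conditioning with respect to $N_{anc}$. Since $N_{anc} \cap [x - s, x)$ is independent of $N_{anc} \cap [x - s, x)^c$, this gives

$$G^{g,h}_s(x) = \exp\left( -\int_{x-s}^x g(z)\, dz \right) \mathbb{E}\left[ \exp\left( \int_{[x-s,x)^c} \log\left( G^{h,h}_s(x - z) \right) N_{anc}(dz) \right) \right].$$

This leads to $\log\left( G^{g,h}_s(x) \right) = -\int_{x-s}^x g(z)\, dz + \int_{[x-s,x)^c} \left( G^{h,h}_s(x - z) - 1 \right) g(z)\, dz$, thanks to Campbell's Theorem [?]. Then, (B.19) clearly follows from the facts that if $z > x > 0$ then $G^{h,h}_s(x - z) = 1$, and $g(z) = 0$ as soon as $z < 0$. □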


Proof of Lemma |4.2| In turn, we use a Banach fixed point argument to prove 
that for all s > 0 there exists a unique couple {Lg, Gg) G L^(R+) x L°°(K+) solution 
to these equations. To do so, let us first study Equation (4.111 and define Tg.s : 

L“(M+) ^ L“(M+) by 7b,.(/)(x) := exp f(x - z)h{z)dz - h(z)dz^ . 

The right-hand side is well-defined since h G and / G L°°. Moreover we have 


TGAf)i^) ^ e 




(x —s)V0 


h(z)dz 


This shows that Tg,s maps the ball of radius 1 of into itself, and more precisely 
into the intersection of the positive cone and the ball. We distinguish two cases: 

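- If $x < s$, then $T_{G,s}(f)(x) = \exp\big(-\int_0^x h(z)\,dz\big)$ for any $f$; thus, the unique fixed point is given by $G_s : x \mapsto \exp\big(-\int_0^x h(z)\,dz\big)$, which does not depend on $s > x$.

- And if $x > s$, the functional $T_{G,s}$ is a $k$-contraction in $\{f \in L^\infty(\mathbb{R}_+),\ \|f\|_{L^\infty} \leq 1\}$, with $k \leq \int_0^\infty h(z)\,dz < 1$, by convexity of the exponential. More precisely, using that for all $x, y$, $|e^x - e^y| \leq e^{\max(x,y)} |x - y|$, we end up with, for $\|f\|_{L^\infty}, \|g\|_{L^\infty} \leq 1$,

\[
\begin{aligned}
|T_{G,s}(f)(x) - T_{G,s}(g)(x)| &\leq e^{-\int_0^x h(z)\,dz}\, e^{\int_0^{x-s} h(z)\,dz}\, \|f - g\|_{L^\infty} \int_0^{x-s} h(z)\,dz \\
&\leq \|f - g\|_{L^\infty} \int_{\mathbb{R}_+} h(z)\,dz.
\end{aligned}
\]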


Hence there exists only one fixed point $G_s$, which we can identify with $G_s^{h,h}$ given in Proposition B.7; since $G_s^{h,h}$ is a probability, $G_s$ takes values in $[0,1]$.

Analogously, we define the functional $T_{L,s} : L^1(\mathbb{R}_+) \to L^1(\mathbb{R}_+)$ by $T_{L,s}(f)(x) := \int_{s \wedge x}^{x} \big(h(z) + f(z)\big)\, G_s(z)\, h(x-z)\,dz$, and it is easy to check that $T_{L,s}$ is well-defined as well. We similarly distinguish the two cases:

- If $x \leq s$, then the unique fixed point is given by $L_s(x) = 0$.

- And if $x > s$, then $T_{L,s}$ is a $k$-contraction with $k \leq \int_0^\infty h(y)\,dy < 1$ in $L^1((s,\infty))$ since $\|G_s\|_{L^\infty} \leq 1$:

\[
\begin{aligned}
\|T_{L,s}(f) - T_{L,s}(g)\|_{L^1} &= \int_s^\infty \Big| \int_s^x \big(f(z) - g(z)\big)\, G_s(z)\, h(x-z)\,dz \Big|\, dx \\
&\leq \|G_s\|_{L^\infty} \int_s^\infty \int_z^\infty |f(z) - g(z)|\, h(x-z)\,dx\,dz \\
&= \|G_s\|_{L^\infty}\, \|f - g\|_{L^1((s,\infty))} \int_0^\infty h(y)\,dy.
\end{aligned}
\]

In the same way, there exists only one fixed point $L_s = L_s^{h,h}$ given by Proposition B.7. In particular $L_s(x \leq s) \equiv 0$.

Finally, as a consequence of Equation (4.12), we find that if $L_s$ is the unique fixed point of $T_{L,s}$, then

\[
\|L_s\|_{L^1} \leq \frac{\big(\int_0^\infty h(y)\,dy\big)^2}{1 - \int_0^\infty h(y)\,dy},
\]

and therefore $L_s$ is uniformly bounded in $L^1$ with respect to $s$.
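
The fixed-point argument for $G_s$ can be illustrated numerically by iterating $T_{G,s}$ on a grid. The sketch below is not taken from the paper: the exponential kernel $h(z) = ab\,e^{-bz}$ (so that $\|h\|_{L^1} = a < 1$), the function name `solve_G`, the grid sizes and the trapezoidal quadrature are all assumptions made for illustration.

```python
import numpy as np

def solve_G(s, a=0.5, b=1.0, x_max=8.0, n=401, tol=1e-12, max_iter=200):
    """Picard iteration G <- T_{G,s}(G) on a uniform grid (illustrative only).

    Hypothetical kernel: h(z) = a*b*exp(-b*z), whose L^1 norm is a < 1.
    """
    x = np.linspace(0.0, x_max, n)
    dx = x[1] - x[0]
    h = a * b * np.exp(-b * x)
    H = a * (1.0 - np.exp(-b * x))        # closed form of int_0^x h(z) dz
    G = np.ones(n)                        # start inside the unit ball of L^infty
    for _ in range(max_iter):
        new = np.empty(n)
        for i in range(n):
            # grid index of the upper limit (x - s) v 0
            m = int(round(max(x[i] - s, 0.0) / dx))
            if m == 0:
                conv = 0.0
            else:
                # trapezoidal rule for int_0^{(x-s)v0} G(x - z) h(z) dz
                vals = G[i - np.arange(m + 1)] * h[: m + 1]
                conv = dx * (vals.sum() - 0.5 * (vals[0] + vals[-1]))
            new[i] = np.exp(conv - H[i])
        done = np.max(np.abs(new - G)) < tol
        G = new
        if done:
            break
    return x, G

x, G = solve_G(s=1.0)
```

For $x < s$ the iteration returns $\exp(-\int_0^x h)$ after a single step, in line with the first case of the proof; for $x > s$ the iterates converge geometrically because the discrete map inherits a contraction constant close to $\|h\|_{L^1}$.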


Lemma B.8. Let $N$ be a linear Hawkes process with past before time 0 given by $N_-$ and intensity on $(0,\infty)$ given by $\lambda(t, \mathcal{F}_{t-}^N) = \mu + \int_{-\infty}^{t-} h(t-x)\,N(dx)$, where $\mu$ is a positive real number and $h$ is a non-negative function with support in $\mathbb{R}_+$, such that $\|h\|_{L^1} < 1$. If the distribution of $N_-$ satisfies (B.15), then the integrability assumption on the intensity required by Theorem 3.3 is satisfied.


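Proof. For all $t \geq 0$, let $\lambda(t) = \mathbb{E}\big[\lambda(t, \mathcal{F}_{t-}^N)\big]$. By Proposition B.4,

\[
\lambda(t) = \mathbb{E}\Big[\mu + \int_0^{t-} h(t-x)\,N_{>0}(dx)\Big] + \mathbb{E}\Big[\int_{-\infty}^{t-} h(t-x)\,N_{\leq 0}(dx)\Big],
\]

which is possibly infinite.

Let us apply Proposition B.7 with $g = \mu$ and $s = 0$, the choice $s = 0$ implying that $\mathcal{E}_{t,0}(N_{>0})$ is of probability 1. Therefore

\[
\mathbb{E}\Big[\mu + \int_0^{t-} h(t-x)\,N_{>0}(dx)\Big] = \mu\Big(1 + \int_0^t \big(h(x) + L_0(x)\big)\,dx\Big),
\]

where $(L_0, G_0 = 1)$ is the solution of Lemma 4.2 for $s = 0$, by identification of Proposition B.7. Hence $\mathbb{E}\big[\mu + \int_0^{t-} h(t-x)\,N_{>0}(dx)\big] \leq \mu\big(1 + \|h\|_{L^1} + \|L_0\|_{L^1}\big)$.

On the other hand, thanks to Lemma B.9, we have

\[
\mathbb{E}\Big[\int_{-\infty}^{t-} h(t-x)\,N_{\leq 0}(dx)\Big] = \mathbb{E}\Big[\sum_{T \in N_-} \Big(h(t-T) + \int_0^t \big[h(t-x) + L_0(t-x)\big]\, h(x-T)\,dx\Big)\Big].
\]

Since all the quantities are non-negative, one can exchange all the integrals and deduce that

\[
\mathbb{E}\Big[\int_{-\infty}^{t-} h(t-x)\,N_{\leq 0}(dx)\Big] \leq M\big(1 + \|h\|_{L^1} + \|L_0\|_{L^1}\big),
\]

with $M = \sup_{t \geq 0} \mathbb{E}\big[\int_{-\infty}^{0} h(t-x)\,N_-(dx)\big]$, which is finite by assumption. Hence, $\lambda(t) \leq (\mu + M)\big(1 + \|h\|_{L^1} + \|L_0\|_{L^1}\big)$, and therefore the integrability assumption is satisfied. □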
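Proof of Proposition B.5. First, by Proposition B.4,

\[
\begin{aligned}
\mathbb{E}\big[\lambda(t, \mathcal{F}_{t-}^N) \,\big|\, S_{t-} \geq s\big] &= \mu + \mathbb{E}\Big[\int_0^{t-} h(t-z)\,N_{>0}(dz) \,\Big|\, \mathcal{E}_{t,s}(N)\Big] + \mathbb{E}\Big[\int_{-\infty}^{t-} h(t-z)\,N_{\leq 0}(dz) \,\Big|\, \mathcal{E}_{t,s}(N)\Big] \\
&= \mu + \mathbb{E}\Big[\int_0^{t-} h(t-z)\,N_{>0}(dz) \,\Big|\, \mathcal{E}_{t,s}(N_{>0})\Big] + \mathbb{E}\Big[\int_{-\infty}^{t-} h(t-z)\,N_{\leq 0}(dz) \,\Big|\, \mathcal{E}_{t,s}(N_{\leq 0})\Big].
\end{aligned}
\]

By Lemma B.7 we obtain $\mathbb{E}\big[\lambda(t, \mathcal{F}_{t-}^N) \mid S_{t-} \geq s\big] = \mu + L_s^{\mu,h}(t) + \Phi_{-,P_0}^{h}(t,s)$. Identifying by Lemma 4.2 $L_s = L_s^{h,h}$ and $G_s = G_s^{h,h}$, we obtain

\[
\mathbb{E}\big[\lambda(t, \mathcal{F}_{t-}^N) \,\big|\, S_{t-} \geq s\big] = \Phi_+^{\mu,h}(t,s) + \Phi_{-,P_0}^{h}(t,s).
\]

Hence $\Phi_{P_0}^{\mu,h}(t,s) = \mathbb{E}\big[\lambda(t, \mathcal{F}_{t-}^N) \mid S_{t-} \geq s\big]$.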

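Lemma B.8 ensures that the assumptions of Theorem 3.3 are fulfilled. Let $u$ and $\rho_{P_0}^{\mu,h} = \rho_{\lambda,P_0}$ be defined accordingly as in Theorem 3.3. With respect to the PDE system, there are two possibilities to express $\mathbb{E}\big[\lambda(t, \mathcal{F}_{t-}^N)\,\mathbf{1}_{\{S_{t-} \geq s\}}\big]$. The first one involves $\rho_{\lambda,P_0}$ and is $\mathbb{E}\big[\rho_{P_0}^{\mu,h}(t, S_{t-})\,\mathbf{1}_{\{S_{t-} \geq s\}}\big]$, whereas the second one involves $\Phi_{P_0}^{\mu,h}$ and is $\Phi_{P_0}^{\mu,h}(t,s)\,\mathbb{P}(S_{t-} \geq s)$.

This leads to $\int_s^{+\infty} \rho_{P_0}^{\mu,h}(t,x)\,u(t,dx) = \Phi_{P_0}^{\mu,h}(t,s) \int_s^{+\infty} u(t,dx)$, since $u(t,ds)$ is the distribution of $S_{t-}$. Let us denote $v(t,s) = \int_s^{+\infty} u(t,dx)$: this relation, together with Equation (3.10) for $u$, immediately gives us that $v$ satisfies Equation (4.14) with $\Phi = \Phi_{P_0}^{\mu,h}$. Moreover, $\int_0^{+\infty} u(t,dx) = 1$, which gives us the boundary condition in (4.15).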


B.3.4. Study of the general case for $\Phi_{-,P_0}^{h}$ in Proposition B.5


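Lemma B.9. Consider a non-negative function $h$ with support in $\mathbb{R}_+$ such that $\int h < 1$, a point process $N_-$ on $\mathbb{R}_-$ with distribution $P_0$, and $N_{\leq 0}$ defined by (B.11). If $\Phi_{-,P_0}^{h}(t,s) := \mathbb{E}\big[\int_{-\infty}^{t-} h(t-z)\,N_{\leq 0}(dz) \,\big|\, \mathcal{E}_{t,s}(N_{\leq 0})\big]$, for all $s, t \geq 0$, then

\[
\Phi_{-,P_0}^{h}(t,s) = \mathbb{E}\Big[\sum_{T \in N_-} \big(h(t-T) + K_s(t,T)\big) \,\Big|\, \mathcal{E}_{t,s}(N_{\leq 0})\Big], \qquad \text{(B.22)}
\]

where $K_s(t,u)$ is given by (4.13).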

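Proof. Following the decomposition given in Proposition B.4, one has

\[
\Phi_{-,P_0}^{h}(t,s) = \mathbb{E}\Big[\sum_{T \in N_-} \Big(h(t-T) + \sum_{V \in N_1^T} \Big(h(t-V) + \sum_{W \in N_c^{T,V}} h(t-V-W)\Big)\Big) \,\Big|\, \mathcal{E}_{t,s}(N_{\leq 0})\Big],
\]

where $\mathcal{E}_{t,s}(N_{\leq 0}) = \mathcal{E}_{t,s}(N_-) \cap \bigcap_{T' \in N_-} \Big(\mathcal{E}_{t,s}(N_1^{T'}) \cap \bigcap_{V' \in N_1^{T'}} \mathcal{E}_{t-V',s}(N_c^{T',V'})\Big)$. Let us fix $T \in N_-$, $V \in N_1^T$ and compute the conditional expectation of the inner sum with respect to $N_-$ and $N_1^T$. In the same way as for (B.21) we end up with

\[
\mathbb{E}\Big[\sum_{W \in N_c^{T,V}} h(t-V-W) \,\Big|\, N_-,\ N_1^T,\ \mathcal{E}_{t-V,s}(N_c^{T,V})\Big] = L_s^{h,h}(t-V),
\]

since, conditionally on $N_-$ and $N_1^T$, $N_c^{T,V}$ has the same distribution as $N_c$. Using the conditional independence of the cluster processes $(N_c^{T,V})$ with respect to $\big(N_-, (N_1^T)_{T \in N_-}\big)$, one can apply Lemma B.6 with $Y = \big(N_-, (N_1^T)_{T \in N_-}\big)$ and $X_{(T,V)} = N_c^{T,V}$ and deduce that

\[
\Phi_{-,P_0}^{h}(t,s) = \mathbb{E}\Big[\sum_{T \in N_-} \Big(h(t-T) + \sum_{V \in N_1^T} \big(h(t-V) + L_s^{h,h}(t-V)\big)\Big) \,\Big|\, \mathcal{E}_{t,s}(N_{\leq 0})\Big].
\]
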

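Let us fix $T \in N_-$ and compute the conditional expectation of the inner sum with respect to $N_-$, which is

\[
F := \mathbb{E}\Big[\sum_{V \in N_1^T} \big(h(t-V) + L_s^{h,h}(t-V)\big) \,\Big|\, N_-,\ A_s^T\Big], \qquad \text{(B.23)}
\]

where $A_s^T = \mathcal{E}_{t,s}(N_1^T) \cap \Big(\bigcap_{V' \in N_1^T} \mathcal{E}_{t-V',s}(N_c^{T,V'})\Big)$. For every $V \in N_1^T$, we say that $V$ has mark 0 if neither $V$ itself nor any of its descendants lies in $[t-s,t)$, and mark 1 otherwise. Let us denote by $N_1^{T,0}$ the set of points with mark 0 and $N_1^{T,1} = N_1^T \setminus N_1^{T,0}$.

For any $V \in N_1^T$, $\mathbb{P}\big(V \in N_1^{T,0} \mid N_1^T\big) = G_s^{h,h}(t-V)\,\mathbf{1}_{[t-s,t)^c}(V)$ and all the marks are chosen independently given $N_1^T$. Hence, $N_1^{T,0}$ and $N_1^{T,1}$ are independent Poisson processes and the intensity of $N_1^{T,0}$ is given by $\lambda(v) = h(v-T)\,\mathbf{1}_{[0,\infty)}(v)\,G_s^{h,h}(t-v)\,\mathbf{1}_{[t-s,t)^c}(v)$.

Moreover, $A_s^T$ is the event $\{N_1^{T,1} = \emptyset\}$, so

\[
\begin{aligned}
F &= \mathbb{E}\Big[\sum_{V \in N_1^{T,0}} \big(h(t-V) + L_s^{h,h}(t-V)\big) \,\Big|\, N_-,\ N_1^{T,1} = \emptyset\Big] \\
&= \int_{-\infty}^{t} \big[h(t-v) + L_s^{h,h}(t-v)\big]\, h(v-T)\,\mathbf{1}_{[0,\infty)}(v)\, G_s^{h,h}(t-v)\,\mathbf{1}_{[t-s,t)^c}(v)\,dv \\
&= K_s(t,T).
\end{aligned}
\]

Using the independence of the cluster processes, one can apply Lemma B.6 with $Y = N_-$ and $X_T = \big(N_1^T, (N_c^{T,V})_{V \in N_1^T}\big)$, and (B.22) clearly follows. □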


Lemma B.10. Under the assumptions and notations of Proposition B.5 and Lemma 4.2, the function $\Phi_{-,P_0}^{h}$ of Proposition B.5 can be identified with (4.18) under $(\mathcal{A}^1_{N_-})$ and with (4.19) under $(\mathcal{A}^2_{N_-})$, and (B.15) is satisfied in those two cases.
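Proof. Using Lemma B.9, we have $\Phi_{-,P_0}^{h}(t,s) = \mathbb{E}\big[\sum_{T \in N_-} \big(h(t-T) + K_s(t,T)\big) \,\big|\, \mathcal{E}_{t,s}(N_{\leq 0})\big]$.

Under $(\mathcal{A}^1_{N_-})$. On the one hand, for every $t \geq 0$,

\[
\mathbb{E}\Big[\int_{-\infty}^{0} h(t-x)\,N_-(dx)\Big] = \mathbb{E}\big[h(t - T_0)\big] = \int_{-\infty}^{0} h(t-t_0)\,f_0(t_0)\,dt_0 \leq \|f_0\|_{L^\infty} \int_0^\infty h(y)\,dy,
\]

hence $P_0$ satisfies (B.15). On the other hand, since $N_-$ is reduced to one point $T_0$,

\[
\Phi_{-,P_0}^{h}(t,s) = \frac{1}{\mathbb{P}\big(\mathcal{E}_{t,s}(N_{\leq 0})\big)}\, \mathbb{E}\Big[\big(h(t-T_0) + K_s(t,T_0)\big)\,\mathbf{1}_{\mathcal{E}_{t,s}(N_{\leq 0})}\Big],
\]

using the definition of the conditional expectation. First, we compute $\mathbb{P}\big(\mathcal{E}_{t,s}(N_{\leq 0}) \mid T_0\big)$. To do so, we use the decomposition $\mathcal{E}_{t,s}(N_{\leq 0}) = \{T_0 < t-s\} \cap \mathcal{E}_{t,s}(N_1^{T_0}) \cap \Big(\bigcap_{V \in N_1^{T_0}} \mathcal{E}_{t-V,s}(N_c^{T_0,V})\Big)$ and the fact that, conditionally on $N_1^{T_0}$, for all $V \in N_1^{T_0}$, $N_c^{T_0,V}$ has the same distribution as $N_c$, to deduce that

\[
\mathbb{E}\big[\mathbf{1}_{\mathcal{E}_{t,s}(N_{\leq 0})} \,\big|\, T_0\big] = \mathbf{1}_{T_0 < t-s}\; \mathbb{E}\big[\mathbf{1}_{\mathcal{E}_{t,s}(N_1^{T_0})} \,\big|\, T_0\big]\; \mathbb{E}\Big[\prod_{V \in N_1^{T_0} \cap [t-s,t)^c} G_s(t-V) \,\Big|\, T_0\Big],
\]

because the event $\mathcal{E}_{t,s}(N_1^{T_0})$ involves $N_1^{T_0} \cap [t-s,t)$ whereas the product involves $N_1^{T_0} \cap [t-s,t)^c$, both of those processes being two independent Poisson processes. Their respective intensities are $\lambda(x) = h(x-T_0)\,\mathbf{1}_{[(t-s)\vee 0,\,t)}(x)$ and $\lambda(x) = h(x-T_0)\,\mathbf{1}_{[0,\,(t-s)\vee 0)}(x)$, so we end up with

\[
\begin{aligned}
\mathbb{E}\big[\mathbf{1}_{\mathcal{E}_{t,s}(N_1^{T_0})} \,\big|\, T_0\big] &= \exp\Big(-\int_{t-s}^{t} h(x-T_0)\,\mathbf{1}_{[0,\infty)}(x)\,dx\Big), \\
\mathbb{E}\Big[\prod_{V \in N_1^{T_0} \cap [t-s,t)^c} G_s(t-V) \,\Big|\, T_0\Big] &= \exp\Big(-\int_0^{(t-s)\vee 0} \big[1 - G_s(t-x)\big]\, h(x-T_0)\,dx\Big).
\end{aligned}
\]

The product of these two last quantities is exactly $q(t,s,T_0)$ given by (4.13). Note that $q(t,s,T_0)$ is exactly the probability that $T_0$ has no descendant in $[t-s,t)$ given $T_0$. Hence, $\mathbb{P}\big(\mathcal{E}_{t,s}(N_{\leq 0})\big) = \int_{-\infty}^{0 \wedge (t-s)} q(t,s,t_0)\,f_0(t_0)\,dt_0$ and (4.18) clearly follows.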

Under $(\mathcal{A}^2_{N_-})$. On the one hand, for any $t \geq 0$,

\[
\mathbb{E}\Big[\int_{-\infty}^{0} h(t-x)\,N_-(dx)\Big] = \mathbb{E}\Big[\int_{-\infty}^{0} h(t-x)\,\alpha\,dx\Big] \leq \alpha \int_0^\infty h(y)\,dy,
\]

hence $P_0$ satisfies (B.15). On the other hand, since we are dealing with a Poisson process, we can use the same argumentation of marked Poisson processes as in the proof of Lemma B.7. For every $T \in N_-$, we say that $T$ has mark 0 if neither $T$ itself nor any of its descendants lies in $[t-s,t)$, and mark 1 otherwise. Let us denote by $N_-^0$ the set of points with mark 0 and $N_-^1 = N_- \setminus N_-^0$. For any $T \in N_-$, we have

\[
\mathbb{P}\big(T \in N_-^0 \mid N_-\big) = q(t,s,T)\,\mathbf{1}_{[t-s,t)^c}(T),
\]

and all the marks are chosen independently given $N_-$. Hence, $N_-^0$ and $N_-^1$ are independent Poisson processes and the intensity of $N_-^0$ is given by

\[
\lambda(z) = \alpha\,\mathbf{1}_{z \leq 0 \wedge (t-s)}\; q(t,s,z).
\]

Moreover, $\mathcal{E}_{t,s}(N_{\leq 0}) = \{N_-^1 = \emptyset\}$. Hence,

\[
\Phi_{-,P_0}^{h}(t,s) = \mathbb{E}\Big[\sum_{T \in N_-^0} \big(h(t-T) + K_s(t,T)\big) \,\Big|\, N_-^1 = \emptyset\Big],
\]

which gives (4.19) thanks to the independence of $N_-^0$ and $N_-^1$.

B.3.5. Proof of Propositions 4.3 and 4.4


Since we already proved Proposition B.5 and Lemma B.10, to obtain Proposition 4.3 it only remains to prove that $\Phi_{P_0}^{\mu,h} \in L^\infty(\mathbb{R}_+^2)$, to ensure uniqueness of the solution by Remark 4.1. To do so, it is easy to see that the assumption $h \in L^\infty(\mathbb{R}_+)$ combined with Lemma 4.2, giving that $G_s \in [0,1]$ and $L_s \in L^1(\mathbb{R}_+)$, ensures that $q$ and $K_s$ are in $L^\infty$. In turn, this implies that $\Phi_{-,P_0}^{h}$ in both (4.18) and (4.19) is in $L^\infty(\mathbb{R}_+)$, which concludes the proof of Proposition 4.3.


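Proof of Proposition 4.4. The method of characteristics leads us to rewrite the solution $v$ of (4.14)-(4.15) by defining $f^{in} \equiv v^{in}$ on $\mathbb{R}_+$, $f^{in} \equiv 1$ on $\mathbb{R}_-$ such that

\[
v(t,s) =
\begin{cases}
f^{in}(s-t)\, e^{-\int_0^{t} \Phi(y,\, s-t+y)\,dy} & \text{when } s \geq t, \\
f^{in}(s-t)\, e^{-\int_{t-s}^{t} \Phi(y,\, s-t+y)\,dy} & \text{when } t \geq s.
\end{cases}
\qquad \text{(B.24)}
\]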
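
The representation along characteristics can be checked with a short computation. The sketch below assumes that (4.14)-(4.15) read $\partial_t v + \partial_s v + \Phi\, v = 0$ with $v(t,0) = 1$ and $v(0,s) = v^{in}(s)$; this form is not restated in this appendix, and the auxiliary function $\varphi$ is introduced only for illustration.

```latex
% Along a characteristic started at (t_0, s_0), set \varphi(\tau) = v(t_0+\tau, s_0+\tau).
\begin{align*}
\varphi'(\tau) &= (\partial_t v + \partial_s v)(t_0+\tau, s_0+\tau)
                = -\Phi(t_0+\tau, s_0+\tau)\,\varphi(\tau), \\
\varphi(\tau)  &= \varphi(0)\,\exp\Bigl(-\int_0^{\tau} \Phi(t_0+r, s_0+r)\,dr\Bigr).
\end{align*}
% Case s \ge t: start at (t_0, s_0) = (0, s-t), take \tau = t and y = r:
%   v(t,s) = v^{in}(s-t)\,\exp\bigl(-\int_0^{t} \Phi(y, s-t+y)\,dy\bigr).
% Case t \ge s: start at (t_0, s_0) = (t-s, 0), take \tau = s and y = t-s+r:
%   v(t,s) = 1 \cdot \exp\bigl(-\int_{t-s}^{t} \Phi(y, s-t+y)\,dy\bigr).
```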

Let $P_0^M$ be the distribution of the past given by $(\mathcal{A}^1_{N_-})$ and $T_0 \sim \mathcal{U}([-M-1, -M])$. By Proposition 4.3, let $v_M$ be the solution of System (4.14)-(4.15) with $\Phi = \Phi_{P_0^M}^{\mu,h}$ and $v^{in} = v_M^{in}$ (i.e. the survival function of a uniform variable on $[-M-1, -M]$). Let also $v_M^\infty$ be the solution of System (4.14)-(4.15) with $\Phi = \Phi_{P_0^M}^{\mu,h}$ and $v^{in} \equiv 1$, and $v^\infty$ the solution of (4.21)-(4.22). Then

\[
\|v_M - v^\infty\|_{L^\infty((0,T)\times(0,S))} \leq \|v_M - v_M^\infty\|_{L^\infty((0,T)\times(0,S))} + \|v_M^\infty - v^\infty\|_{L^\infty((0,T)\times(0,S))}.
\]

By definition of $v_M^{in}$, it is clear that $v_M^{in}(s) = 1$ for $s \leq M$, so that Formula (B.24) implies that $v_M(t,s) = v_M^\infty(t,s)$ as soon as $s - t \leq M$, and so $\|v_M - v_M^\infty\|_{L^\infty((0,T)\times(0,S))} = 0$ as soon as $M \geq S$.

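To evaluate the distance $\|v_M^\infty - v^\infty\|_{L^\infty((0,T)\times(0,S))}$, it remains to prove that $e^{-\int_0^t \Phi_{-,P_0^M}^{h}(y,\, s-t+y)\,dy} \to 1$ uniformly on $(0,T)\times(0,S)$ for any $T > 0$, $S > 0$. For this, it suffices to prove that $\Phi_{-,P_0^M}^{h}(t,s) \to 0$ uniformly on $(0,T)\times(0,S)$. Since $q$ given by (4.13) takes values in $[\exp(-2\|h\|_{L^1}), 1]$, (4.18) implies

\[
\Phi_{-,P_0^M}^{h}(t,s) \leq \frac{\int_{-\infty}^{0 \wedge (t-s)} \big(h(t-t_0) + K_s(t,t_0)\big)\,\mathbf{1}_{[-M-1,-M]}(t_0)\,dt_0}{\int_{-\infty}^{0 \wedge (t-s)} \exp(-2\|h\|_{L^1})\,\mathbf{1}_{[-M-1,-M]}(t_0)\,dt_0}.
\]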
Since $\|G_s\|_{L^\infty} \leq 1$, and $L_s$ and $h$ are non-negative, it is clear that

\[
K_s(t,t_0) \leq \int_0^{+\infty} \big[h(t-x) + L_s(t-x)\big]\, h(x-t_0)\,dx,
\]

and so

\[
\begin{aligned}
\int_{-M-1}^{-M} K_s(t,t_0)\,dt_0 &\leq \int_0^{+\infty} \big[h(t-x) + L_s(t-x)\big] \Big(\int_{-M-1}^{-M} h(x-t_0)\,dt_0\Big)\,dx \\
&\leq \int_M^\infty h(y)\,dy \int_0^{+\infty} \big[h(t-x) + L_s(t-x)\big]\,dx \\
&\leq \int_M^\infty h(y)\,dy\, \big[\|h\|_{L^1} + \|L_s\|_{L^1}\big].
\end{aligned}
\]

Hence, for $M$ large enough,

\[
\Phi_{-,P_0^M}^{h}(t,s) \leq \frac{\int_M^\infty h(y)\,dy\, \big[1 + \|h\|_{L^1} + \|L_s\|_{L^1}\big]}{\exp(-2\|h\|_{L^1})} \to 0,
\]

uniformly in $(t,s)$ since $L_s$ is uniformly bounded in $L^1$, which concludes the proof.


B.4. Thinning 

The proof of Ogata's thinning algorithm uses a generalization of point processes, namely the marked point processes. However, only the basic properties of simple and marked point processes are needed (see [3] for a good overview of point processes theory). Here $(\mathcal{F}_t)_{t>0}$ denotes a general filtration such that $\mathcal{F}_t^N \subset \mathcal{F}_t$ for all $t > 0$, and not necessarily the natural one, i.e. $(\mathcal{F}_t^N)_{t>0}$.

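Theorem B.11. Let $\Pi$ be a $(\mathcal{F}_t)$-Poisson process with intensity 1 on $\mathbb{R}_+^2$. Let $\lambda(t, \mathcal{F}_{t-})$ be a non-negative $(\mathcal{F}_t)$-predictable process which is $L^1_{loc}$ a.s. and define the point process $N$ by $N(C) = \int_{C \times \mathbb{R}_+} \mathbf{1}_{[0,\lambda(t,\mathcal{F}_{t-})]}(z)\, \Pi(dt \times dz)$, for all $C \in \mathcal{B}(\mathbb{R}_+)$.

Then $N$ admits $\lambda(t, \mathcal{F}_{t-})$ as a $(\mathcal{F}_t)$-predictable intensity. Moreover, if $\lambda$ is in fact $(\mathcal{F}_t^N)$-predictable, i.e. $\lambda(t, \mathcal{F}_{t-}) = \lambda(t, \mathcal{F}_{t-}^N)$, then $N$ admits $\lambda(t, \mathcal{F}_{t-}^N)$ as a $(\mathcal{F}_t^N)$-predictable intensity.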

Proof. The goal is to apply the martingale characterization of the intensity (Chapter II, Theorem 9 in [3]). We cannot consider $\Pi$ as a point process on $\mathbb{R}_+$ marked in $\mathbb{R}_+$ (in particular, the point with the smallest abscissa cannot be defined). However, for every $k \in \mathbb{N}$, we can define $\Pi^{(k)}$, the restriction of $\Pi$ to the points with ordinate smaller than $k$, by $\Pi^{(k)}(C) = \int_C \Pi(dt \times dz)$ for all $C \in \mathcal{B}(\mathbb{R}_+ \times [0,k])$. Then $\Pi^{(k)}$ can be seen as a point process on $\mathbb{R}_+$ marked in $E_k := [0,k]$ with intensity kernel $1\cdot dz$ with respect to $(\mathcal{F}_t)$. In the same way, we define $N^{(k)}$ by

\[
N^{(k)}(C) = \int_{C \times \mathbb{R}_+} \mathbf{1}_{z \in [0,\lambda(t,\mathcal{F}_{t-})]}\, \Pi^{(k)}(dt \times dz) \quad \text{for all } C \in \mathcal{B}(\mathbb{R}_+).
\]

Let $\mathcal{P}(\mathcal{F}_t)$ be the predictable $\sigma$-algebra (see page 8 of [3]).

Let us denote $\mathcal{E}_k = \mathcal{B}([0,k])$ and $\mathcal{P}_k(\mathcal{F}_t) = \mathcal{P}(\mathcal{F}_t) \otimes \mathcal{E}_k$ the associated marked predictable $\sigma$-algebra.
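For any fixed $z$ in $E_k$, $\{(u,\omega) \in \mathbb{R}_+ \times \Omega \text{ such that } \lambda(u, \mathcal{F}_{u-})(\omega) \geq z\} \in \mathcal{P}(\mathcal{F}_t)$ since $\lambda$ is predictable. If $\Gamma_k = \{(u,\omega,z) \in \mathbb{R}_+ \times \Omega \times E_k,\ \lambda(u, \mathcal{F}_{u-})(\omega) \geq z\}$, then

\[
\Gamma_k = \bigcap_{n \in \mathbb{N}^*} \bigcup_{q \in \mathbb{Q}_+} \{(u,\omega) \in \mathbb{R}_+ \times \Omega,\ \lambda(u, \mathcal{F}_{u-})(\omega) \geq q\} \times \Big(\big[0, q + \tfrac{1}{n}\big] \cap E_k\Big).
\]

So, $\Gamma_k \in \mathcal{P}_k(\mathcal{F}_t)$ and $\mathbf{1}_{z \in [0,\lambda(u,\mathcal{F}_{u-})] \cap E_k}$ is $\mathcal{P}_k(\mathcal{F}_t)$-measurable. Hence, one can apply the Integration Theorem (Chapter VIII, Corollary 4 in [3]). So,

\[
(X_t)_{t \geq 0} := \Big(\int_0^t \int_{E_k} \mathbf{1}_{z \in [0,\lambda(u,\mathcal{F}_{u-})]}\, \bar{\Pi}^{(k)}(du \times dz)\Big)_{t \geq 0} \text{ is a } (\mathcal{F}_t)\text{-local martingale,}
\]

where $\bar{\Pi}^{(k)}(du \times dz) = \Pi^{(k)}(du \times dz) - dz\,du$. In fact,

\[
X_t = N_t^{(k)} - \int_0^t \min\big(\lambda(u, \mathcal{F}_{u-}), k\big)\,du.
\]

Yet, $N_t^{(k)}$ (respectively $\int_0^t \min(\lambda(u, \mathcal{F}_{u-}), k)\,du$) is non-decreasingly converging towards $N_t$ (resp. $\int_0^t \lambda(u, \mathcal{F}_{u-})\,du$). Both of the limits are finite a.s. thanks to the local integrability of the intensity (see page 27 of [3]). Thanks to monotone convergence we deduce that $\big(N_t - \int_0^t \lambda(u, \mathcal{F}_{u-})\,du\big)_{t \geq 0}$ is a $(\mathcal{F}_t)$-local martingale. Then, thanks to the martingale characterization of the intensity, $N_t$ admits $\lambda(t, \mathcal{F}_{t-})$ as an $(\mathcal{F}_t)$-intensity. The last point of the Theorem is a reduction of the filtration. Since $\lambda(t, \mathcal{F}_{t-}) = \lambda(t, \mathcal{F}_{t-}^N)$, it is a fortiori $(\mathcal{F}_t^N)$-progressive and the desired result follows (see page 27 in [3]). □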
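
The construction of Theorem B.11 translates into a simulation recipe: draw a unit-rate Poisson process on a strip and keep the abscissas of the points lying under the intensity. The sketch below is an illustration under an extra assumption that the theorem does not need, namely that the intensity is bounded by a known constant `lam_max` on the simulated window. The function names and the refractory-type intensity are hypothetical examples, not taken from the paper.

```python
import numpy as np

def simulate_by_thinning(intensity, lam_max, horizon, rng):
    """Thinning on [0, horizon]; assumes intensity(t, past) <= lam_max everywhere."""
    # Unit-rate Poisson process on the strip [0, horizon] x [0, lam_max]:
    # Poisson number of points, then independent uniform coordinates.
    n = rng.poisson(lam_max * horizon)
    times = np.sort(rng.uniform(0.0, horizon, size=n))
    marks = rng.uniform(0.0, lam_max, size=n)
    accepted = []
    for t, z in zip(times, marks):
        # The intensity is evaluated on the strict past only (predictability).
        # A strict inequality is used; it differs from the closed indicator
        # 1_{[0, lambda]}(z) only on an event of probability zero.
        if z < intensity(t, accepted):
            accepted.append(t)
    return np.array(accepted)

def refractory_intensity(t, past, rate=5.0, delta=0.5):
    """Hypothetical example: constant rate, silenced for delta after each spike."""
    if past and t - past[-1] < delta:
        return 0.0
    return rate

rng = np.random.default_rng(0)
spikes = simulate_by_thinning(refractory_intensity, lam_max=5.0, horizon=20.0, rng=rng)
```

For an unbounded intensity such as that of a Hawkes process, the strip height would have to be updated adaptively, as in Ogata's algorithm; the fixed bound is kept here only for brevity.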


This final result can be found in 


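Proposition B.12 (Inversion Theorem).

Let $N = \{T_n\}_{n>0}$ be a non explosive point process on $\mathbb{R}_+$ with $(\mathcal{F}_t^N)$-predictable intensity $\lambda_t = \lambda(t, \mathcal{F}_{t-}^N)$. Let $\{U_n\}_{n>0}$ be a sequence of i.i.d. random variables with uniform distribution on $[0,1]$. Moreover, suppose that they are independent of $\mathcal{F}_\infty^N$. Denote $\mathcal{G}_t = \sigma(U_n, T_n \leq t)$. Let $\hat{N}$ be a homogeneous Poisson process with intensity 1 on $\mathbb{R}_+^2$, independent of $\mathcal{F}_\infty^N \vee \mathcal{G}_\infty$. Define a point process $\bar{N}$ on $\mathbb{R}_+^2$ by

\[
\bar{N}((a,b] \times A) = \sum_{n>0} \mathbf{1}_{(a,b]}(T_n)\, \mathbf{1}_A\big(U_n\, \lambda(T_n, \mathcal{F}_{T_n-}^N)\big) + \int_{(a,b]} \int_{A \setminus [0,\lambda(t,\mathcal{F}_{t-}^N)]} \hat{N}(dt \times dz)
\]

for every $0 \leq a < b$ and $A \subset \mathbb{R}_+$.

Then, $\bar{N}$ is a homogeneous Poisson process on $\mathbb{R}_+^2$ with intensity 1 with respect to the filtration $(\mathcal{H}_t)_{t \geq 0} = \big(\mathcal{F}_t^N \vee \mathcal{G}_t \vee \mathcal{F}_t^{\hat{N}}\big)_{t \geq 0}$.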